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ABSTRACT 

We  have  developed  a compiler  for  the  lexically-scoped  dialect  of  LISP 
known  as  SCHEME.  The  compiler  knows  relatively  little  about  specific  data 
manipulation  primitives  such  as  arithmetic  operators,  but  concentrates  on  general 
issues  of  environment  and  control.  Rather  than  having  specialized  knowledge 
about  a large  variety  of  control  and  environment  constructs,  the  compiler  handles 
only  a small  basis  set  which  reflects  the  semantics  of  lambda-calculus.  All  of 
the  traditional  imperative  constructs,  such  as  sequencing,  assignment,  looping, 
GOTO,  as  well  as  many  standard  LISP  constructs  such  as  AND,  OR,  and  COND,  are 
expressed  as  macros  in  terms  of  the  applicative  basis  set.  A small  number  of 
optimization  techniques,  coupled  with  the  treatment  of  function  calls  as  GOTO 
statements,  serve  to  produce  code  as  good  as  that  produced  by  more  traditional 
compilers.  The  macro  approach  enables  speedy  implementation  of  new  constructs  as 
desired  without  sacrificing  efficiency  in  the  generated  code.p^. 

A fair  amount  of  analysis  is  devoted  to  determining  whether  environments, 
may  be  stack-allocated  or  must  be  heap-allocated.  Heap-allocated  environments 
are  necessary  in  general  because  SCHEME  (unlike  Algol  60  and  Algol  68,  for 

example)  allows  procedures  with  free  lexically  scoped  variables  to  be  returned  as 
the  values  of  other  procedures;  the  Algol  stack-allocation  environment  strategy 
does  not  suffice.  The  methods  used  here  indicate  that  a heap-allocating 
generalization  of  the  "display"  technique  leads  to  an  efficient  implementation  of 
such  "upward  funargs".  Moreover,  compile-time  optimization  and  analysis  can 

eliminate  many  "funargs"  entirely,  and  so  far  fewer  environment  structures  need 
be  allocated  at  run  time  than  might  be  expected. 

A subset  of  SCHEME  (rather  than  triples,  for  example)  serves  as  the 

representation  intermediate  between  the  optimized  SCHEME  code  and  the  final 
output  code;  code  is  expressed  in  this  subset  in  the  so-called  continuation- 
passing style.  As  a subset  of  SCHEME,  it  enjoys  the  same  theoretical  properties; 
one  could  even  apply  the  same  optimizer  used  on  the  input  code  to  the 

intermediate  code.  However,  the  subset  is  so  chosen  that  all  temporary 
quantities  are  made  manifest  as  variables,  and  no  control  stack  is  needed  to 
evaluate  it.  As  a result,  this  apparently  applicative  representation  admits  an 
imperative  interpretation  which  permits  easy  transcription  to  final  imperative 
machine  code.  These  qualities  suggest  that  an  applicative  language  like  SCHEME 
is  a better  candidate  for  an  UNCOL  than  the  more  imperative  candidates  proposed 
to  date. 
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Note 


The  first  part  of  this  report  is  a slightly  revised  version  of  a 
dissertation  submitted  in  May  1977.  Where  it  was  of  historical  interest  to 
reflect  changes  in  the  SCHEME  language  which  ocurred  in  the  following  year  and 
the  effect  they  had  on  RABBIT,  the  text  was  left  intact,  with  notes  added  of  the 
form,  "Since  the  dissertation  was  written,  thus-and-so  occurred."  The  second 
part,  the  Appendix,  was  not  part  of  the  dissertation,  and  is  a complete  listing 
of  the  source  code  for  RABBIT,  with  extensive  commentary. 

It  is  intended  that  the  first  part  should  be  self-contained,  and  provide 
a qualitative  overview  of  the  compilation  methods  used  in  RABBIT.  The  second 
part  is  provided  for  those  readers  who  would  like  to  examine  the  precise 
mechanisms  used  to  carry  out  the  general  methods. 

Thus  there  are  five  levels  of  thoroughness  at  which  ct.’J  reader  may 
consume  this  document: 

(1)  The  reader  who  wishes  only  to  skim  is  advised  to  read  sections  1,  5,  6, 
possibly  7,  8A,  8B,  8C,  10,  11,  and  1Z.  This  will  give  a basic  overview, 
including  the  use  of  macros  and  the  optimizing  techniques. 

(Z)  The  reader  who  also  wants  to  know  about  the  details  of  SCHEME,  the  run-time 
system,  and  a long  example  is  advised  to  read  the  entire  main  text  (about  a third 
of  the  document). 

(3)  The  reader  who  wants  to  understand  the  low-level  organization  of  the 
algorithms,  and  read  about  the  more  tricky  special  cases,  should  read  the  main 
text  and  then  the  commentary  on  the  code. 

(4)  The  reader  who  additionally  wants  to  understand  the  nit-picking  details 
should  read  the  code  along  with  the  commentary. 

(5)  The  reader  who  wants  a real  feel  for  the  techniques  involved  should  read  the 
entire  document,  invent  three  new  SCHEME  constructs  and  write  macros  for  them, 
and  then  reimplement  the  compiler  for  another  run-time  environment.  (He  ought 
please  also  to  send  a copy  of  any  documents  on  such  a project  to  this  author,  who 
would  be  very  interested!) 
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1.  Introduction 

The  work  described  here  is  a continuation  (!)  of  that  described  in 
[SCHEME],  [Imperative],  and  [Declarative].  Before  enumerating  the  points  of  the 
thesis,  we  summarize  here  each  of  these  documents. 


A.  Background 

In  [SCHEME]  we  (Gerald  Jay  Sussman  and  the  author)  described  the 
implementation  of  a dialect  of  LISP  named  SCHEME  with  the  properties  of  lexical 
scoping  and  tail-recursion;  this  implementation  is  embedded  within  MacLISP 

[Moon],  a version  of  LISP  which  does  not  have  these  properties.  The  property  of 
lexical  scoping  (that  a variable  can  be  referenced  only  from  points  textually 
within  the  expression  which  binds  it)  is  a consequence  of  the  fact  that  all 
functions  are  closed  in  the  "binding  environment".  [Noses]  That  is,  SCHEME  is  a 
"full-funarg"  LISP  dialect.  (Note  Full-Funarg  Example)  The  property  of  tail- 
recursion  implies  that  loops  written  in  an  apparently  recursive  form  will 
actually  be  executed  in  an  iterative  fashion.  Intuitively,  function  calls  do  not 
"push  control  stack";  instead,  it  is  argument  evaluation  which  pushes  control 
stack.  The  two  properties  of  lexical  scoping  and  tail-recursion  are  not 

independent.  In  most  LISP  systems  [LISP1.5N]  [Noon]  [Teitelman],  which  use 
dynamic  scoping  rather  than  lexical,  tail-recursion  is  impossible  because 
function  calls  must  push  control  stack  in  order  to  be  able  to  undo  the  dynamic 

bindings  after  the  return  of  the  function.  On  the  other  hand,  it  is  possible  to 

have  a lexically  scoped  LISP  which  does  not  tail-recurse,  but  it  is  easily  seen 
that  such  an  implementation  only  wastes  storage  space  needlessly  compared  to  a 
tail-recursing  implementation.  [Steele]  Together,  these  two  properties  cause 
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SCHEME  to  reflect  lambda-calculus  semantics  much  more  closely  than  dynamically 
scoped  LISP  systems.  SCHEME  also  permits  the  treatment  of  functions  as  full- 
fledged  data  objects;  they  may  be  passed  as  arguments,  returned  as  values,  made 
part  of  composite  data  structures,  and  notated  as  independent,  unnamed 
("anonymous")  entities.  (Contrast  this  with  most  ALGOL-like  languages,  in  which 
a function  can  be  written  only  by  declaring  it  and  giving  it  a name;  imagine 
being  able  to  use  an  integer  value  only  by  giving  it  a name  in  a declaration!) 
The  property  of  lexical  scoping  allows  this  to  be  done  in  a consistent  manner 
without  the  possibility  of  identifier  conflicts  (that  is,  SCHEME  "solves  the 
FUNARG  problem"  [Moses]).  In  [SCHEME]  we  also  discussed  the  technique  of 
"continuation-passing  style",  a way  of  writing  programs  in  SCHEME  such  that  no 
function  ever  returns  a value. 

In  [Imperative]  we  explored  ways  of  exploiting  these  properties  to 
implement  most  traditional  programming  constructs,  such  as  assignment,  looping, 
and  call-by-name,  in  terms  of  function  application.  Such  applicative  (lambda- 
calculus)  models  of  programming  language  constructs  are  well-known  to 
theoreticians  (see  [Stoy],  for  example),  but  have  not  been  used  in  a practical 
programming  system.  All  of  these  constructs  are  actually  made  available  in 
SCHEME  by  macros  which  expand  into  these  applicative  definitions.  This  technique 
has  permitted  the  speedy  implementation  of  a rich  user-level  language  in  terms  of 
a very  small,  easy-to-implement  basis  set  of  primitive  constructs.  In 
[Imperative]  we  continued  the  exploration  of  continuation-passing  style,  and 
noted  that  the  escape  operator  CATCH  is  easily  modelled  by  transforming  a program 
into  this  style.  We  also  pointed  out  that  transforming  a program  into  this  style 
enforces  a particular  order  of  argument  evaluation,  and  makes  all  intermediate 
computational  quantities  manifest  as  variables. 


In  [Declarative]  we  examined  more  closely  the  issue  of  tail-recursion 
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and  demonstrated  that  the  usual  view  of  function  calls  as  pushing  a return 
address  must  lead  to  an  either  inefficient  or  inconsistent  implementation,  while 
the  tail-recursive  approach  of  SCHEME  leads  to  a uniform  discipline  in  which 
function  calls  are  treated  as  GOTO  statements  which  also  pass  arguments.  We  also 
noted  that  a consequence  of  lexical  scoping  is  that  the  only  code  which  can 
reference  the  value  of  a variable  in  a given  environment  is  code  which  is  closed 
in  that  environment  or  which  receives  the  value  as  an  argument;  this  in  turn 
implies  that  a compiler  can  structure  a run-time  environment  in  any  arbitrary 
fashion,  because  it  will  compile  all  the  code  which  can  reference  that 
environment,  and  so  can  arrange  for  that  code  to  reference  it  in  the  appropriate 
manner.  Such  references  do  not  require  any  kind  of  search  (as  is  commonly  and 
incorrectly  believed  in  the  LISP  community  because  of  early  experience  with  LISP 
interpreters  which  search  a-lists)  because  the  compiler  can  determine  the  precise 
location  of  each  variable  in  an  environment  at  compile  time.  It  is  not  necessary 
to  use  a standard  format,  because  neither  interpreted  code  nor  other  compiled 
code  can  refer  to  that  environment.  (This  is  to  be  constrasted  with  "spaghetti 
stacks"  [Bobrow  and  Wegbreit].)  In  [Declarative]  we  also  carried  on  the  analysis 
of  continuation-passing  style,  and  noted  that  transforming  a program  into  this 
style  elucidates  traditional  compilation  issues  such  as  register  allocation 
because  user  variables  and  intermediate  quantities  alike  are  made  manifest  as 
variables  on  an  equal  footing.  Appendix  A of  [Declarative]  contained  an 
algorithm  for  converting  any  SCHEME  program  (not  containing  ASET)  to 
continuation-passing  style. 

We  have  implemented  two  compilers  for  the  language  SCHEME.  The  purpose 
was  to  explore  compilation  techniques  for  a language  modelled  on  lambda-calculus, 
using  lambda-calculus-style  models  of  imperative  programming  constructs.  Both 
compilers  use  the  strategy  of  converting  the  source  program  to  continuation- 
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passing  style. 

The  first  compiler  (known  as  CHEAPY)  was  written  as  a throw-away 
implementation  to  test  the  concept  of  conversion  to  continuation-passing  style. 

The  first  half  of  CHEAPY  is  essentially  the  algorithm  which  appears  in  Appendix  A 
of  [Declarative],  and  the  second  is  a simple  code  generator  with  almost  no 
optimization.  In  conjunction  with  the  writing  of  CHEAPY,  the  SCHEME  interpreter 
was  modified  to  interface  to  compiled  functions.  (This  interface  is  described 
later  in  this  report.) 

The  second  compiler,  with  which  we  are  primarily  concerned  here,  is  known  \ 

as  RABBIT.  It,  like  CHEAPY,  is  written  almost  entirely  in  SCHEME  (with  minor 

exceptions  due  only  to  problems  in  interfacing  with  certain  MacLISP  I/O  j 

facilities).  Unlike  CHEAPY,  it  is  fairly  clever.  It  is  intended  to  demonstrate 

i 

a number  of  optimization  techniques  relevant  to  lexical  environments  and  tail- 
recursive  control  structures.  (The  code  for  RABBIT,  with  commentary,  appears  in 
the  Appendix.) 

B.  The  Thesis 

(1)  Function  calls  are  not  expensive  when  compiled  correctly;  they  should  be 
thought  of  as  GOTO  statements  that  happen  to  pass  arguments. 

(2)  The  combination  of  cheap  function  calls,  lexical  scoping,  tail-recursion, 
and  "anonymous”  notation  of  functions  (which  are  not  independent  properties 
of  a language,  but  aspects  of  a single  unified  approach)  permits  the 
definition  of  a wide  variety  of  "imperative"  constructs  in  applicative 
terms.  Because  these  properties  result  from  adhering  to  the  principles  of 
the  well-known  lambda-calculus  [Church],  such  definitions  can  be  lifted 
intact  from  existing  literature  and  used  directly. 
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(3)  A macro  facility  (the  ability  to  specify  syntactic  transformations)  makes  it 
practical  to  use  these  as  the  only  definitions  of  imperative  constructs  in  a 
programming  system.  Such  a facility  makes  it  extremely  easy  to  define  new 
constructs . 

(4)  A few  well-chosen  optimization  strategies  enable  the  compilation  of  these 
applicative  definitions  into  the  imperative  low-level  code  which  one  would 
expect  from  a traditional  compiler. 

(5)  The  macro  facility  and  the  optimization  techniques  used  by  the  compiler  can 
be  conceptually  unified.  The  same  properties  which  make  it  easy  to  write 
the  macros  make  it  easy  to  define  optimizations  correctly.  In  the  same  way 
that  many  programming  constructs  are  defined  in  terms  of  a small,  well- 
chosen  basis  set,  so  a large  number  of  traditional  optimization  techniques 
fall  out  as  special  cases  of  the  few  used  in  RABBIT.  This  is  no  accident. 
The  separate  treatment  of  a large  and  diverse  set  of  constructs  necessitates 
separate  optimization  techniques  for  each.  As  the  basis  set  of  constructs 
is  reduced,  so  is  the  set  of  interesting  transformations.  If  the  basis  set 
is  properly  chosen,  their  combined  effect  is  "multiplicative"  rather  than 
"additive". 

(6)  The  technique  of  compiling  by  converting  to  continuation-passing  style 
elucidates  some  important  compilation  issues  in  a natural  way.  Intermediate 
quantities  are  made  manifest;  so  is  the  precise  order  of  evaluation. 
Moreover,  this  is  all  expressed  in  a language  isomorphic  to  a subset  of  the 
source  language  SCHEME;  as  a result  the  continuation-passing  style  version 
of  a program  inherits  many  of  the  philosophical  and  practical  advantages. 
For  example,  the  same  optimization  techniques  can  be  applied  at  this  level 
as  at  the  original  source  level.  While  the  use  of  continuation-passing 
style  may  not  make  the  decisions  any  easier,  it  provides  an  effective  and 
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natural  way  to  express  the  results  of  those  decisions. 

(7)  Continuation-passing  style,  while  apparently  applicative  in  nature,  admits  a 
peculiarly  imperative  interpretation  as  a consequence  of  the  facts  that  it 
requires  no  control  stack  to  be  evaluated  and  that  no  functions  ever  return 
values.  As  a result,  it  is  easily  converted  to  an  imperative  machine 
language. 

(8)  A SCHEME  compiler  should  ideally  be  a designer  of  good  data  structures, 
since  it  may  choose  any  representation  whatsoever  for  environments.  RABBIT 
has  a rudimentary  design  knowledge,  involving  primarily  the  preferral  of 
registers  to  heap-allocated  storage.  However,  there  is  room  for  knowledge 
of  "bit-diddling"  representations. 

(9)  We  suggest  that  those  who  have  tried  to  design  useful  UNCOL's  (UNiversal 
Computer-Oriented  Languages)  [Sammet]  [Coleman]  have  perhaps  been  thinking 
too  imperatively,  and  worrying  more  about  data  manipulation  primitives  than 
about  environment  and  control  issues.  As  a result,  proposed  UNCOLs  have 
been  little  more  than  generalizations  of  contemporary  machine  languages.  We 
suggest  that  SCHEME  makes  an  ideal  UNCOL  at  two  levels.  The  first  level  is 
the  fully  applicative  level,  to  which  most  source-language  constructs  are 
easily  reduced;  the  second  is  the  continuation-passing  style  level,  which 
is  easily  reduced  to  machine  language.  We  envision  building  a compiler  in 
three  stages:  (a)  reduction  of  a user  language  to  basic  SCHEME,  whether  by 
macros,  a parser  of  algebraic  syntax,  or  some  other  means;  (b)  optimization 
by  means  of  SCHEME-level  source-to-source  transformations,  and  conversion  to 
continuation-passing  style;  and  (c)  generation  of  code  for  a particular 
machine.  RABBIT  addresses  itself  to  the  second  stage.  Data  manipulation 
primitives  are  completely  ignored  at  this  stage,  and  are  just  passed  along 
from  input  to  output.  These  primitives,  whether  integer  arithmetic,  string 
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concatenation  and  parsing,  or  list  structure  manipulators,  are  chosen  as  a 
function  of  a particular  source  language  and  a particular  target  machine. 
RABBIT  deals  only  with  fundamental  environment  and  control  issues  common  to 
most  modes  of  algorithmic  expression. 

(10)  While  syntactic  issues  tend  to  be  rather  superficial,  we  point  out  that 
algebraic  syntax  tends  to  obscure  the  fundamental  nature  of  function  calling 
and  tail-recursion  by  arbitrarily  dividing  functions  into  syntactic  classes 
such  as  "operators"  and  "functions".  ([Standish],  for  example,  uses  much 
space  to  exhibit  each  conceptually  singular  transformation  in  a multiplicity 
of  syntactic  manifestations.)  The  lack  of  an  "anonymous"  notation  for 
functions  in  most  algebraic  languages,  and  the  inability  to  treat  functions 
as  data  objects,  is  a distinct  disadvantage.  The  uniformity  of  LISP  syntax 
makes  these  issues  easier  to  deal  with. 

To  the  LISP  community  in  particular  we  address  these  additional  points: 

(11)  Lexical  scoping  need  not  be  as  expensive  as  is  commonly  thought.  Experience 
with  lexically-scoped  interpreters  is  misleading;  lexical  scoping  is  not 
inherently  slower  than  dynamic  scoping.  While  some  implementations  may 
entail  access  through  multiple  levels  of  structure,  this  occurs  only  under 
circumstances  (accessing  of  variables  through  multiple  levels  of  closure) 
which  could  not  even  be  expressed  in  a dynamically  scoped  language.  Unlike 
deep-bound  dynamic  variables,  compiled  lexical  access  requires  no  search; 
unlike  shallow-bound  dynamic  variables,  lexical  binding  does  not  require 
that  values  be  put  in  a canonical  value  cell.  The  compiler  has  complete 
discretion  over  the  manipulation  of  environments  and  variable  values.  The 
"display"  technique  used  in  Algol  implementations  can  be  generalized  to 
provide  an  efficient  solution  to  the  FUNARG  problem. 

(12)  Lexical  scoping  does  not  necessarily  make  LISP  programming  unduly  difficult. 
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The  very  existence  of  RABBIT,  a working  compiler  some  fifty  pages  in  length 
written  in  SCHEME,  first  implemented  in  about  a month,  part-time, 
substantiates  this  claim  (which  is,  however,  admitted  to  be  mostly  a matter 
of  taste  and  experience).  (Note  Refinement  of  RABBIT)  SCHEME  has  also  been 
used  to  implement  several  AI  problem-solving  languages,  including  AHORD 
[Doyle]. 


15 


2.  The  Source  Language  - SCHEME 

The  basic  language  processed  by  RABBIT  is  a subset  of  the  SCHEME  language 
as  described  in  [SCHEME]  and  [Revised  Report],  the  primary  restrictions  being 
that  the  first  argument  to  ASET  must  be  quoted  and  that  the  multiprocessing 
primitives  are  not  accommodated.  This  subset  is  summarized  here. 

SCHEME  is  essentially  a lexically  scoped  ("full  funarg")  dialect  of  LISP. 
Interpreted  programs  are  represented  by  S-expressions  in  the  usual  manner . 
Numbers  represent  themselves.  Atomic  symbols  are  used  as  identifiers  (with  the 
conventional  exception  of  T and  NIL,  which  are  conceptually  treated  as 
constants).  All  other  constructs  are  represented  as  lists. 

In  order  to  distinguish  the  various  other  constructs,  SCHEME  follows  the 
usual  convention  that  a list  whose  car  is  one  of  a set  of  distinguished  atomic 
symbols  is  treated  as  directed  by  a rule  associated  with  that  symbol.  All  other 
lists  (those  with  non-atomic  cars,  or  with  undistinguished  atoms  in  their  cars) 
are  combinations,  or  function  calls.  All  subforms  of  the  list  are  uniformly 
evaluated  in  an  unspecified  order,  and  then  the  value  of  the  first  (the  function) 
is  applied  to  the  values  of  all  the  others  (the  arguments).  Notice  that  the 
function  position  is  evaluated  in  the  same  way  as  the  argument  positions  (unlike 
most  other  LI SP  systems).  (In  order  to  be  able  to  refer  to  MacLISP  functions, 
global  identifiers  evaluate  to  a special  kind  of  functional  object  if  they  have 
definitions  as  MacLISP  functions  of  the  EXPR,  SUBR,  or  LSUBR  varieties.  Thus 
"(PLUS  1 2)"  evaluates  to  3 because  the  values  of  the  subforms  are  <functional 
object  for  PLUS>,  1,  and  2;  and  applying  the  first  to  the  other  two  causes 
invocation  of  the  MacLISP  primitive  PLUS.) 

The  atomic  symbols  which  distinguish  special  constructs  are  as  follows: 
LAMBDA  This  denotes  a function.  A form  (LAMBDA  (varl  var2  ...  varn)  body) 
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will  evaluate  to  a function  of  n arguments.  The  parameters  vari  are 
identifiers  (atomic  symbols)  which  may  be  used  in  the  body  to  refer  to 
the  respective  arguments  when  the  function  is  invoked.  Note  that  a 
LAMBDA-expression  is  not  a function,  but  evaluates  to  one,  a crucial 
distinction . 

IF  This  denotes  a conditional  form.  (IF  a b c)  evaluates  the  predicate  a, 

producing  a value  x;  if  x is  non-NIL,  then  the  consequent  b is 
evaluated,  and  otherwise  the  alternative  c.  If  c is  omitted,  NIL  is 
assumed. 

QUOTE  As  in  all  LISP  systems,  this  provides  a way  to  specify  any  S-expression 
as  a constant.  (QUOTE  x)  evaluates  to  the  S-expression  x.  This  may  be 
abbreviated  to  'x,  thanks  to  the  HacLISP  read-macro-character  feature. 

LABELS  This  primitive  permits  the  local  definition  , of  one  or  more  mutually 
recursive  functions.  The  format  is: 

(LABELS  ( (namel  (LAMBDA  ...)) 

(nameZ  (LAMBDA  ...)) 

(namen  (LAMBDA  ...))) 
body) 

This  evaluates  the  body  in  an  environment  in  which  the  names  refer  to 
the  respective  functions,  which  are  themselves  closed  in  that  same 
environment.  Thus  references  to  these  names  in  the  bodies  of  the 
LAMBDA-expressions  will  refer  to  the  labelled  functions.  {Note 
Generalized  LABELS) 

ASET'  This  is  the  primitive  side-effect  on  variables.  (ASET*  var  body) 
evaluates  the  body,  assigns  the  resulting  value  to  the  variable  var, 
and  returns  that  value.  {Note  Non-quoted  ASET)  For  implementation- 
dependent  reasons,  it  is  forbidden  by  RABBIT  to  use  ASET'  on  a global 
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variable  which  is  the  name  of  a primitive  NacLISP  function,  or  on  a 
variable  bound  by  LABELS.  (ASET1  is  actually  used  very  seldom  in 
practice  anyway,  and  all  these  restrictions  are  "good  programming 
practice".  RABBIT  could  be  altered  to  lift  these  restrictions,  at  some 
expense  and  labor.) 

CATCH  This  provides  an  escape  operator  facility.  [Landin]  [Reynolds]  (CATCH 
var  body)  evaluates  the  body,  which  may  refer  to  the  variable  var, 
which  will  denote  an  "escape  function"  of  one  argument  which,  when 
called,  will  return  from  the  CATCH-form  with  the  given  argument  as  the 
value  of  the  CATCH-form.  Note  that  it  is  entirely  possible  to  return 
from  the  CATCH-form  several  times.  This  raises  a difficulty  with 
optimization  which  will  be  discussed  later. 

Macros  Any  atomic  symbol  which  has  been  defined  in  one  of  various  ways  to  be  a 
macro  distinguishes  a special  construct  whose  meaning  is  determined  by 
a macro  function.  This  function  has  the  responsibility  of  rewriting 
the  form  and  returning  a new  form  to  be  evaluated  in  place  of  the  old 
one.  In  this  way  complex  syntactic  constructs  can  be  expressed  in 

terms  of  simpler  ones. 
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3.  The  Target  Language 

The  "target  language”  is  a highly  restricted  subset  of  HacLISP,  rather 
than  any  particular  machine  language  for  an  actual  hardware  machine  such  as  the 
PDP-10.  RABBIT  produces  HacLISP  function  definitions  which  are  then  compiled  by 
the  standard  HacLISP  compiler.  In  this  way  we  do  not  need  to  deal  with  the 
uninteresting  vagaries  of  a particular  piece  of  hardware,  nor  with  the 
peculiarities  of  the  many  and  various  data-manipulation  primitives  (CAR,  RPLACA, 
♦ , etc.).  We  allow  the  HacLISP  compiler  to  deal  with  them,  and  concentrate  on 
the  issues  of  environment  and  control  which  are  unique  to  SCHEHE.  While  for 
production  use  this  is  mildly  inconvenient  (since  the  code  must  be  passed  through 
two  compilers  before  use),  for  research  purposes  it  has  saved  the  wasteful  re- 
implementation of  much  knowledge  already  contained  in  the  HacLISP  compiler. 

On  the  other  hand,  the  use  of  HacLISP  as  a target  language  does  not  by 
any  means  trivialize  the  task  of  RABBIT.  The  HacLISP  function-calling  mechanism 
cannot  be  used  as  a target  construct  for  the  SCHEHE  function  call,  because 
HacLISP' s function  calls  are  not  guaranteed  to  behave  tail-recursively.  Since 
tail-recursion  is  a most  crucial  characteristic  distinguishing  SCHEHE  from  most 
LISP  systems,  we  must  implement  SCHEHE  function  calls  by  more  primitive  methods. 
Similarly,  since  SCHEHE  is  a full-funarg  dialect  of  LISP  while  HacLISP  is  not,  we 
cannot  in  general  use  HacLISP' s variable-binding  mechanisms  to  implement  those  of 
SCHEHE.  On  the  other  hand,  it  is  a perfectly  legitimate  optimization  to  use 
HacLISP  mechanisms  in  those  limited  situations  where  they  are  applicable. 

Aside  from  ordinary  HacLISP  data-manipulation  primitives,  the  only 
HacLISP  constructs  used  in  the  target  language  are  PROG,  GO,  RETURN,  PROGN,  COND, 
SETQ,  and  ((LAHBDA  ...)  ...).  PROG  is  never  nested;  there  is  only  a single, 
outer  PROG.  RETURN  is  used  only  in  the  form  (RETURN  NIL)  to  exit  this  outer 
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PROG;  it  is  never  used  to  return  a value  of  any  kind.  LAMBDA-expressions  are 
used  only  to  bind  temporary  variables.  In  addition,  CONS,  CAR,  CDR,  RPLACA,  and 
RPLACD  are  used  in  the  creation  and  manipulation  of  environments. 

We  may  draw  a parallel  between  each  of  these  constructs  and  an  equivalent 
machine-language  (or  rather,  assembly  language)  construct: 

PROG  A single  program  module. 

GO  A branch  instruction.  PROG  tags  correspond  to  instruction  labels. 

RETURN  Exit  from  program  module. 

PROGN  Sequencing  of  several  instructions. 

COND  Conditional  branches,  used  in  a disciplined  manner.  One  may  think  of 


(COND  (predl  valuel) 
(predZ  valueZ) 

(predn  valuen)) 


as  representing  the  sequence  of  code 


Ccode  for  predl> 

JUMP- IF-NIL  regl.TAGl 
<code  for  valuel> 

JUMP  ENDTAG 

TAG1:  <code  for  predZ> 

JUNP-IF-NIL  regl.TAGZ 
<code  for  valueZ> 

JUMP  ENDTAG 

TAGZ : 

<code  for  predn> 
JUMP-IF-NIL  regl.TAGn 
<code  for  valuen> 

JUMP  ENDTAG 

TAGn : LOAD-VALUE  NIL 

ENDTAG : 


SETQ 


which  admits  of  some  optimizations,  but  we  shall  not  worry  about  this. 
(The  MacLISP  compiler  does,  but  we  do  not  depend  at  all  on  this  fact.) 
Load  register,  or  store  into  memory. 
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LAMBDA 


CONS 


We  use  this  primarily  in  the  form: 

((LAMBDA  (ql  ...  qn) 

(setq  varl  ql) 

(setq  varn  qn)) 
valuel  ...  valuen) 

which  we  may  think  of  as  saving  values  on  a temporary  stack  and  then 
popping  them  into  the  variables: 

<code  for  valuel>  ;leaves  result  in  regl 

PUSH  regl 

<code  for  valuen> 

PUSH  regl 
POP  varn 

POP  varl 

This  is  in  fact  approximately  how  the  MacLISP  compiler  will  treat  this 
construct.  This  is  used  to  effect  the  simultaneous  assignment  of 
several  values  to  several  registers.  It  would  be  possible  to  do 
without  the  MacLISP  LAMBDA  in  this  case,  by  using  extra  intermediate 
variables,  but  it  was  decided  that  this  task  was  less  interesting  than 
other  issues  within  RABBIT,  and  that  assignments  of  this  kind  would 
occur  sufficiently  often  that  it  was  desirable  to  get  the  MacLISP 

compiler  to  produce  the  best  possible  code  in  this  case. 

The  form  ((LAMBDA  ...)  ...)  is  also  used  in  some  situation  where  the 

user  wrote  such  a form  in  the  SCHEME  code,  and  the  arguments  and 

LAMBDA-body  are  all  "trivial",  in  a sense  to  be  defined  later. 

CONS  is  used,  among  other  things,  to  "push"  new  values  onto  the  current 

environment.  While  SCHEME  variables  can  sometimes  be  represented  as 
temporary  MacLISP  variables  using  LAMBDA,  in  general  they  must  be  kept 
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CAR,  + 


in  a "consed  environment"  in  the  heap;  CAR  and  CDR  are  used  to  "index" 
the  environment  "stack"  (which  is  not  really  a stack,  but  in  general 
tree-like).  (N.B.  By  using  CONS  for  this  purpose  we  can  push  the 
entire  issue  of  environment  retention  off  onto  the  LISP  garbage 
collector.  It  would  be  possible  to  use  array-like  blocks  for 
environments,  and  an  Algol-like  "display"  pointer  discipline  for 
variable  access.  However,  a retention  strategy  as  opposed  to  a 

deletion  strategy  must  be  used  in  general,  because  SCHEME,  unlike  Algol 
60  and  68,  permits  procedures  to  be  the  values  of  other  procedures. 
Stack  allocation  does  not  suffice  in  general  — a heap  must  be  used. 
Later  we  will  see  that  RABBIT  uses  stack  allocation  of  environments  and 
a on  strategy  in  simple  cases,  and  reverts  to  heap  allocation  of 

s and  a retention  strategy  in  more  complicated  situations.) 
Primitive  MacLISP  operators  such  as  * and  CAR  are  analogous  to  machine- 
language  instructions  such  as  ADD  and  LOAD-INDEXED.  We  leave  to  the 
MacLISP  compiler  the  task  of  compiling  large  expressions  involving 
these;  but  we  are  not  avoiding  the  associated  difficult  issues  such  as 
register  allocation,  for  we  shall  have  to  deal  with  them  in  compiling 
calls  to  SCHEME  functions. 
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4.  The  Target  Machine 

Compiled  code  is  interfaced  to  the  SCHEME  interpreter  in  two  ways.  The 
interpreter  must  be  able  to  recognize  functional  objects  which  happen  to  be 
compiled  and  to  invoke  them  with  given  arguments;  and  compiled  code  must  be  able 
to  invoke  any  function,  whether  interpreted  or  compiled,  with  given  arguments. 
(This  latter  interface  is  traditionally  known  as  the  "UUO  Handler"  as  the  result 
of  the  widespread  use  of  the  PDP-10  in  implementing  LISP  systems.  [DEC]  [Moon] 
[Teitelman])  We  define  here  an  arbitrary  standard  form  for  functional  objects, 
and  a standard  means  for  invoking  them. 

In  the  PDP-10  MacLISP  implementation  of  SCHEME,  a function  is,  in 
general,  represented  as  a list  whose  car  contains  one  of  a set  of  distinguished 
atomic  symbols.  (Notice  that  LAMBDA  is  not  one  of  these;  a LAMBDA-expression 
may  evaluate  to  a function,  but  is  not  itself  a valid  function.)  This  set  of 
symbols  includes  EXPR,  SUBR,  and  LSUBR,  denoting  primitive  MacLISP  functions  of 
those  respective  types;  BETA,  denoting  a SCHEME  function  whose  code  is 
interpretive;  DELTA,  denoting  an  escape  function  created  by  the  interpreter  for 
a CATCH  form,  or  a continuation  given  by  the  interpreter  to  compiled  code; 
CBETA,  denoting  a SCHEME  function  or  continuation  whose  code  is  compiled;  and 
EPSILON,  denoting  a continuation  created  when  compiled  code  invokes  interpreted 
code.  Each  of  these  function  types  requires  a different  invocation  convention; 
the  interpreter  must  distinguish  these  types  and  invoke  them  in  the  appropriate 
manner.  For  example,  to  invoke  an  EXPR  the  MacLISP  FUNCALL  construct  must  be 
used.  A BETA  must  be  invoked  by  creating  an  appropriate  environment,  using  the 
given  arguments,  and  then  interpreting  the  code  of  the  function. 

We  have  arbitrarily  defined  the  CBETA  interface  as  follows:  there  are  a 
number  of  "registers",  in  the  form  of  global  variables.  Nine  registers  called 
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**CONT**,  **ONE**,  **TWO**,  . ..,  **EIGHT**  are  used  to  pass  arguments  to  compiled 
functions.  **CONT**  contains  the  continuation.  The  others  contain  the  arguments 
prescribed  by  the  user;  if  there  are  more  than  eight  arguments,  however,  then 
they  are  passed  as  a list  of  all  the  arguments  in  register  **ONE**,  and  the 
others  are  unused.  (Any  of  a large  variety  of  other  conventions  could  have  been 
chosen,  such  as  the  first  seven  arguments  in  seven  registers  and  a list  of  all 
remaining  arguments  in  **EIGHT**.  We  merely  chose  a convention  which  would  be 
workable  and  convenient,  reflect  the  typical  finiteness  of  hardware  register 
sets,  and  mirror  familiar  LISP  conventions.  The  use  of  a list  of  arguments  is 
analogous  to  the  passing  of  an  arbitrary  number  of  arguments  on  a stack, 
sometimes  known  as  the  LSUBR  convention.  [Moon]  [Declarative]) 

There  is  another  register  called  **FUN**.  A function  is  invoked  by 
putting  the  functional  object  in  **FUN**,  its  arguments  in  the  registers  already 
described,  and  the  number  of  arguments  in  the  register  **NARGS**,  and  then 
exiting  the  current  function.  Control  (at  the  MacLISP  level)  is  then  transferred 
to  a routine  (the  "SCHEME  UUO  handler”)  which  determines  the  type  of  the  function 
in  **FUN**  and  invokes  it. 

A continuation  is  invoked  in  exactly  the  same  manner  as  any  other  kind  of 
function,  with  two  exceptions:  a continuation  does  not  itself  require  a 
continuation,  so  **CONT**  need  not  be  set  up;  and  a continuation  always  takes  a 
single  argument,  so  **NARGS**  need  not  be  set  to  1.  {Note  Multiple- 
Argument  Continuations) 

A CBETA  form  has  additional  fixed  structure.  Besides  the  atomic  symbol 
CBETA  in  the  car,  there  is  always  in  the  cadr  the  address  of  the  code,  and  in  the 
cddr  the  environment.  The  form  of  the  environment  is  completely  arbitrary  as  far 
as  the  SCHEME  interpreter  is  concerned;  indeed,  the  CHEAPY  compiler  and  the 
RABBIT  compiler  use  completely  different  formats  for  environments  for  compiled 


function.  (Recall  that  this  cannot  matter  since  the  only  code  which  will  ever  be 
able  to  access  that  environment  is  the  code  belonging  the  the  functional  closure 
of  which  that  environment  is  a part.)  The  "UUO  handler*  puts  the  cddr  of  **FUN** 
in  the  register  **ENV**,  and  then  transfers  to  the  address  in  the  cadr  of 
**FUN**.  When  that  code  eventually  exits,  control  returns  to  the  "UUO  handler", 
which  expects  the  code  to  have  set  up  **FUN**  and  any  necessary  arguments. 

There  is  a set  of  "memory  locations”  - 1 1 - , -12-,  ...  which  are  used  to 
hold  intermediate  quantities  within  a single  user  function.  (Later  we  shall  see 
that  we  think  of  these  as  being  used  to  pass  values  between  internally  generated 
functions  within  a module.  For  this  purpose  we  think  of  the  "registers"  and 

"memory  locations"  being  arranged  in  a single  sequence  **CONT**,  **ONE** 

**EIGHT**,  -11-,  -12-,  ...  There  is  in  principle  an  unbounded  number  of  these 
"memory  locations",  but  RABBIT  can  determine  (and  in  fact  outputs  as  a 
declaration  for  the  MacLISP  compiler)  the  exact  set  of  such  locations  used  by  any 
given  function.)  One  may  think  of  the  "memory  locations”  as  being  local  to  each 
module,  since  they  are  never  used  to  pass  information  between  modules;  in 
practice  they  are  implemented  as  global  MacLISP  variables. 

The  registers  **FUN**,  **NARGS**,  **ENV**,  and  the  argument  registers  are 
the  only  global  registers  used  by  compiled  SCHEME  code  (other  than  the  "memory 
locations").  Except  for  global  variables  explicitly  mentioned  by  the  user 
program,  all  communication  between  compiled  SCHEME  functions  is  through  these 
registers.  It  is  useful  to  note  that  the  continuation  in  **C0NT**  is  generally 
analogous  to  the  usual  "control  stack"  which  contains  return  addresses,  and  so  we 
may  think  of  **CONT**  as  our  "stack  pointer  register". 
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5.  Language  Design  Considerations 

SCHEME  is  a lexically  scoped  { "full-funarg* ) dialect  of  LISP,  and  so  is 
an  applicative  language  which  conforms  to  the  spirit  of  the  lambda-calculus. 
[Church]  We  divide  the  definition  of  the  SCHEME  language  into  two  parts:  the 
environment  and  control  constructs,  and  the  data  manipulation  primitives. 
Examples  of  the  former  are  LAMBDA-expressions,  combinations,  and  IF;  examples  of 
the  latter  are  CONS,  CAR,  EQ,  and  PLUS.  Note  that  we  can  conceive  of  a version 
of  SCHEME  which  did  not  have  CONS,  for  example,  and  more  generally  did  not  have 
S-expressions  in  its  data  domain.  Such  a version  would  still  have  the  same 
environment  and  control  constructs,  and  so  would  hold  the  same  theoretical 
interest  for  our  purposes  here.  (Such  a version,  however,  would  be  less 
convenient  for  purposes  of  writing  a meta-circular  description  of  the  language, 
however! ) 

By  the  "spirit  of  lambda-calculus"  we  mean  the  essential  properties  of 
the  axioms  obeyed  by  lambda-calculus  expressions.  Among  these  are  the  rules  of 
alpha -con  vers  ion  and  beta-conversion . The  first  intuitively  implies  that  we  can 
uniformly  rename  a function  parameter  and  all  references  to  it  without  altering 
the  meaning  of  the  function.  An  important  corollary  to  this  is  that  we  can  in 
fact  effectively  locate  all  the  references.  The  second  implies  that  in  a 
« situation  where  a known  function  is  being  called  with  known  argument  expressions, 

we  may  substitute  an  argument  expression  for  a parameter  reference  within  the 
body  of  the  function  (provided  no  naming  conflicts  result,  and  that  certain 
restrictions  involving  side  effects  are  met).  Both  of  these  operations  are  of 
importance  to  an  optimizing  compiler.  Another  property  which  follows  indirectly 
is  that  of  tail-recursion.  This  property  is  exploited  in  expressing  iteration  in 


terms  of  applicative  constructs,  and  is  discussed  in  some  detail  in 
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[Declarative]. 

We  realize  that  other  systems  of  environment  and  control  constructs  also 
are  reasonably  concise,  clear,  and  elegant,  and  can  be  axiomatized  in  useful 
ways,  for  example  the  guarded  commands  of  Dijkstra.  [Dijkstra]  However,  that  of 
lambda-calculus  is  extremely  well-understood,  lends  itself  well  to  certain  kinds 
of  optimizations  in  a natural  manner,  and  has  behind  it  a body  of  literature 
which  can  be  used  directly  by  RABBIT  to  express  non-primitive  constructs. 

The  desire  for  uniform  lexical  scoping  arises  from  other  motives  as  well, 
some  pragmatic,  some  philosophical.  Many  of  these  are  described  in  [SCHEME], 
[Imperative],  [Declarative],  and  [Revised  Report].  It  is  often  difficult  to 
explain  some  of  these  to  those  who  are  used  to  dynamically  scoped  LISP  systems. 
Any  one  advantage  of  lexical  scoping  may  often  be  countered  with  "Yes,  but  you 
can  do  that  in  this  other  way  in  a dynamically  scoped  LISP."  However,  we  are 
convinced  that  lexical  scoping  in  its  totality  provides  all  of  the  advantages  to 
be  described  in  a natural,  elegant,  and  integrated  manner,  largely  as  a 
consequence  of  its  ^re at  simplicity. 

There  are  those  to  whom  lexical  scoping  is  nothing  new,  for  example  the 
ALGOL  community.  For  this  audience,  however,  we  should  draw  attention  to  another 
important  feature  of  SCHEME,  which  is  that  functions  are  first-class  data 
objects.  They  may  be  assigned  or  bound  to  variables,  returned  as  values  of  other 
functions,  placed  in  arrays,  and  in  general  treated  as  any  other  data  object. 
Just  as  numbers  have  certain  operations  defined  on  them,  such  as  addition,  so 
functions  have  an  important  operation  defined  on  them,  namely  invocation. 

The  ability  to  treat  functions  as  objects  is  not  at  all  the  same  as  the 
ability  to  treat  representations  of  functions  as  objects.  It  is  the  latter 
ability  that  is  traditionally  associated  with  LISP;  functions  can  be  represented 
as  S-expressions.  In  a version  of  SCHEME  which  had  no  S-expression  primitives. 
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however,  one  could  still  deal  with  functions  (i.e.  closures)  as  such,  for  that 
ability  is  part  of  the  fundamental  environment  and  control  facilities. 

! Conversely,  in  a SCHEME  which  does  have  CONS,  CAR,  and  CDR,  there  is  no  defined 

way  to  use  CONS  by  itself  to  construct  a function  (although  a primitive  ENCLOSE 
is  now  provided  which  converts  an  5-expression  representation  of  a function  into 
a function),  and  the  CAR  or  CDR  of  a function  is  in  general  undefined.  The  only 
defined  operation  on  a function  is  invocation.  (Note  Operations  on  Functions) 

We  draw  this  sharp  distinction  between  environment  and  control  constructs 
on  the  one  hand  and  data  manipulation  primitives  on  the  other  because  only  the 
former  are  treated  in  any  depth  by  RABBIT,  whereas  much  of  the  knowledge  of  a 
"real"  compiler  deals  with  the  latter.  A PL/I  compiler  must  have  much  specific 
knowledge  about  numbers,  arrays,  strings,  and  so  on.  We  have  no  new  ideas  to 
present  here  on  such  issues,  and  so  have  avoided  this  entire  area.  RABBIT  itself 
knows  almost  nothing  about  data  manipulation  primitives  beyond  being  able  to 
recognize  them  and  pass  them  along  to  the  output  code,  which  is  a small  subset  of 
MacLISP . In  this  way  RABBIT  can  concentrate  on  the  interesting  issues  of 
environment  and  control,  and  exploit  the  expert  knowledge  of  data  manipulation 
primitives  already  built  into  the  MacLISP  compiler. 
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6.  The  Use  of  Macros 

• An  important  characteristic  of  the  SCHEME  language  is  that  its  set  of 

primitive  constructs  is  quite  small.  This  set  is  not  always  convenient  for 
expressing  programs,  however,  and  so  a macro  facility  is  provided  for  extending 
the  expressive  power  of  the  language.  A macro  is  best  thought  of  as  a syntax 
rewrite  rine.  As  a simple  example,  suppose  we  have  a primitive  GCD  which  takes 
only  two  arguments,  and  we  wish  to  be  able  to  write  an  invocation  of  a GCD 
function  with  any  number  of  arguments.  We  might  then  define  (in  a "production- 
rule"  style)  the  conditional  rule: 

(XGCD)  =>  0 

( XGCD  x)  =>  x 

(XGCD  x . rest)  =>  (GCD  x (XGCD  . rest)) 

(Notice  the  use  of  LISP  dots  to  refer  to  the  rest  of  a list.)  This  is  not 
considered  to  be  a definition  of  a function  XGCD,  but  a purely  syntactic 
transformation.  In  principle  all  such  transformations  could  be  performed  before 
executing  the  program.  In  fact,  RABBIT  does  exactly  this,  although  the  SCHEME 
interpreter  naturally  does  it  incrementally,  as  each  macro  call  is  encountered. 

Rather  than  use  a separate  production-rule/pattern-matching  language,  in 
practice  SCHEME  macros  are  defined  as  transformation  functions  from  macro-call 
» expressions  to  resulting  S-expressions,  just  as  they  are  in  MacLISP.  (Here, 

however,  we  shall  continue  to  use  production  rules  for  purposes  of  exposition.) 
It  is  important  to  note  that  macros  need  not  be  written  in  the  language  for  which 
they  express  rewrite  rules;  rather,  they  should  be  considered  an  adjunct  to  the 
interpreter,  and  written  in  the  same  language  as  the  interpreter  (or  the 
compiler).  To  see  this  more  clearly,  consider  a version  of  SCHEME  which  does  not 
have  S-expressions  in  its  data  domain.  If  programs  in  this  language  are 
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represented  as  S-expressions,  then  the  interpreter  for  that  language  cannot  be 
written  in  that  language,  but  in  another  meta- language  which  does  deal  with  S- 
expressions.  Macros,  which  transform  one  S-expression  (representing  a macro 
call)  to  another  (the  replacement  form,  or  the  interpretation  of  the  call), 
clearly  should  be  expressed  in  this  meta-language  also.  The  fact  that  in  most 
LISP  systems  the  language  and  the  meta-language  appear  to  coincide  is  a source  of 
both  power  and  confusion. 

In  the  PDP-10  MacLISP  implementation  of  SCHEME,  four  separate  macro 
mechanisms  are  used  in  practice.  One  is  the  MacLISP  read-macro  mechanism  [Moon], 
which  performs  transformations  such  as  'F00  =>  (QUOTE  FOO)  when  an  expression  is 
read  from  a file.  The  other  three  are  as  described  earlier,  processed  by  the 
interpreter  or  compiler,  and  differ  only  in  that  one  kind  is  recognized  by  the 
MacLISP  interpreter  as  well  while  the  other  two  are  used  only  by  SCHEME,  and  that 
of  the  latter  two  one  kind  is  written  in  MacLISP  and  the  other  kind  in  SCHEME 
itself. 

There  is  a growing  library  of  SCHEME  macros  which  express  a variety  of 
traditional  programming  constructs  in  terms  of  other,  more  primitive  constructs, 
and  eventually  in  terms  of  the  small  set  of  primitives.  A number  of  these  are 
catalogued  in  [Imperative]  and  [Revised  Report].  Others  were  invented  in  the 
course  of  writing  RABBIT.  We  shall  give  some  examples  here. 

The  BLOCK  macro  is  similar  to  the  MacLISP  PROGN;  it  evaluates  all  its 
arguments  and  returns  the  value  of  the  last  one.  One  critical  characteristic  is 
that  the  last  argument  is  evaluated  "tail-recursively"  (I  use  horror  quotes 
because  normally  we  speak  of  invocation,  not  evaluation,  as  being  tail- 
recursive).  An  expansion  rule  is  given  for  this  in  [Imperative]  equivalent  to: 

(BLOCK  x)  =>  x 

(BLOCK  x . rest)  «>  ((LAMBDA  (DUMMY)  (BLOCK  . rest))  x) 
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This  definition  exploits  the  fact  that  SCHEME  is  evaluated  in  applicative  order, 
and  so  will  evaluate  all  arguments  before  applying  a function  to  them.  Thus,  in 
the  second  subrule,  x must  be  evaluated,  and  then  the  block  of  all  the  rest  is. 
It  is  then  clear  from  the  first  subrule  that  the  last  argument  is  evaluated 
"tail-recursively" . 

One  problem  with  this  definition  is  the  occurrence  of  the  variable  DUMMY, 
which  must  be  chosen  so  as  not  to  conflict  with  any  variable  used  by  the  user. 
This  we  refer  to  as  the  "GENSYM  problem",  in  honor  of  the  traditional  LISP 
function  which  creates  a "fresh"  symbol.  It  would  be  nicer  to  write  the  macro  in 
such  a way  that  no  conflict  could  arise  no  matter  what  names  were  used  by  the 
user.  There  is  indeed  a way,  which  ALGOL  programmers  will  recognize  as 
equivalent  to  the  use  of  "thunks",  or  call-by-name  parameters: 

(BLOCK  x)  =>  x 

(BLOCK  x . rest)  =>  ((LAMBDA  (A  B)  (B)) 

x 

(LAMBDA  ()  (BLOCK  . rest))) 

Consider  carefully  the  meaning  of  the  right-hand  side  of  the  second  subrule. 
First  the  expression  x and  the  (LAMBDA  ()  ...)  must  be  evaluated  (it  doesn’t 
matter  in  which  order!);  the  result  of  the  latter  is  a function  (that  is,  a 
closure),  which  is  later  invoked  in  order  to  evaluate  the  rest  of  the  arguments. 
There  can  be  no  naming  conflicts  here,  because  the  scope  of  the  variables  A and  B 
(which  is  lexical)  does  not  contain  any  of  the  arguments  to  BLOCK  written  by  the 
user.  (We  should  note  that  we  have  been  sloppy  in  speaking  of  the  "arguments"  to 
BLOCK,  when  BLOCK  is  properly  speaking  not  a function  at  all,  but  merely  a 
syntactic  keyword  used  to  recognize  a situation  where  a syntactic  rewriting  rule 
is  applicable.  We  would  do  better  to  speak  of  "argument  expressions"  or  "macro 
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arguments",  but  we  shall  continue  to  be  sloppy  where  no  confusion  should  arise.) 

This  is  a technique  which  should  be  understood  quite  thoroughly,  since  it 
is  the  key  to  writing  correct  macro  rules  without  any  possibility  of  conflicts 
between  names  used  by  the  user  and  those  needed  by  the  macro.  As  another 
example,  let  us  consider  the  AND  and  OR  constructs  as  used  by  most  LISP  systems. 
OR  evaluates  its  arguments  one  by  one,  in  order,  returning  the  first  non-NIL 
value  obtained  (without  evaluating  any  of  the  following  arguments),  or  NIL  if  all 
arguments  produce  NIL.  AND  is  the  dual  to  this;  it  returns  NIL  if  any  argument 
does,  and  otherwise  the  value  of  the  last  argument.  A simple-minded  approach  to 
OR  would  be: 

(OR)  =>  'NIL 

(OR  x . rest)  =>  (IF  x x (OR  . rest)) 

There  is  an  objection  to  this,  which  is  that  the  code  for  x is  duplicated.  Not 
only  does  this  consume  extra  space,  but  it  can  execute  erroneously  if  x has  any 
side-effects.  We  must  arrange  to  evaluate  x only  once,  and  then  test  its  value: 

(OR)  =>  'NIL 

(OR  x . rest)  =>  ( ( LAMBDA  (V)  (IF  V V (OR  . rest)))  x) 

This  certainly  evaluates  x only  once,  but  admits  a possible  naming  conflict 
between  the  variable  V and  any  variables  used  by  rest.  This  is  avoided  by  the 
i same  technique  used  for  BLOCK: 

(OR)  =>  'NIL 

(OR  x . rest)  =>  ((LAMBDA  (V  R)  (IF  V V (R))) 

x 

(LAMBDA  ()  (OR  . rest))) 


Similarly,  we  can  express  AND  as  follows: 
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(AND)  =>  'T 

(AND  x)  =>  x 

(AND  x . rest)  =>  ((LAMBDA  (V  R)  (IF  V (R)  'NIL)) 

x 

(LAMBDA  ()  (AND  . rest))) 

(The  macro  rules  are  not  precise  duals  because  of  the  non-duality  between  NIL- 
ness  and  non-NIL-ness,  and  the  requirement  that  a successful  AND  return  the 
actual  value  of  the  last  argument  and  not  just  T.)  {Note  Tail-Recursive  OR) 

As  yet  another  example,  consider  a modification  to  BLOCK  to  allow  a 
limited  form  of  assignment  statement:  if  (v  :=  x)  appears  as  a statement  in  a 
block,  it  "assigns"  a value  to  the  variable  v whose  scope  is  the  remainder  of  the 
block.  Let  us  assume  that  such  a statement  cannot  occur  as  the  last  statement  of 
a block  (it  would  be  useless  to  have  one  in  that  position,  as  the  variable  would 
have  a null  scope).  We  can  write  the  rule: 

(BLOCK  x)  =>  x 

(BLOCK  (v  :=  x)  . rest)  =>  ((LAMBDA  (v)  (BLOCK  . rest))  x) 

(BLOCK  x . rest)  =>  ((LAMBDA  (A  B)  (B)) 

x 

(LAMBDA  ()  (BLOCK  . rest))) 

The  second  subrule  states  that  an  "assignment”  causes  x to  be  evaluated  and  then 
bound  to  v,  and  that  the  variable  v is  visible  to  the  rest  of  the  block. 

We  may  think  of  :=  as  a "sub-macro  keyword"  which  is  used  to  mark  an 
expression  as  suitable  for  transformation,  but  only  in  the  context  of  a certain 
larger  transformation.  This  idea  is  easily  extended  to  allow  other 
constructions,  such  as  "simultaneous  assignments"  of  the  form 

((varl  var2  ...  varn)  :=  valuel  value2  ...  valuen) 

which  first  compute  all  the  values  and  then  assign  to  all  the  variables,  and 
"exchange  assignments"  of  the  form  (X  : = : Y),  as  follows: 
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(BLOCK  x)  =>  x 

(BLOCK  (v  :=  x)  . rest) 

=>  ((LAMBDA  (v)  (BLOCK  . rest))  x) 

(BLOCK  (vars  :=  . values)  . rest) 

= > ((LAMBDA  vars  (BLOCK  . rest))  . values) 
(BLOCK  (x  :=:  y)  . rest) 

=>  ((LAMBDA  (x  y)  (BLOCK  . rest))  y x) 
(BLOCK  x . rest)  =>  ((LAMBDA  (A  B)  (B)) 

x 

(LAMBDA  ()  (BLOCK  . rest))) 


Let  us  now  consider  a rule  for  the  sore  complicated  CJND  construct: 


(COND)  =>  'NIL 

(COND  (x)  . rest)  =>  (OR  x (COND  . rest)) 

(COND  (x  . r)  . rest)  =>  (IF  x (BLOCK  . r)  (COND  . rest)) 


This  defines  the  "extended”  COND  of  modern  LISP  systems,  which  produces  NIL  If  no 
clauses  succeed,  which  returns  the  value  of  the  predicate  in  the  case  of  a 
singleton  clause,  and  which  allows  more  than  one  consequent  in  a clause.  An 
important  point  here  is  that  one  can  write  these  rules  in  terms  of  other  macro 
constructs  such  as  OR  and  BLOCK;  moreover,  any  extensions  to  BLOCK,  such  as  the 
limited  assignment  feature  described  above,  are  automatically  inherited  by  COND. 
Thus  with  the  above  definition  one  could  write 


(COND  ((NUMBERP  X)  (Y  :=  (SQRT  X))  (♦  Y (SORT  Y))) 
(T  (HACK  X))) 


where  the  scope  of  the  variable  Y is  the  remainder  of  the  first  COND  clause. 

SCHEME  also  provides  macros  for  such  constructs  as  DO  and  PROG,  all  of 
which  expand  into  similar  Kinds  of  code  using  LAMBDA,  IF,  and  LABELS  (see  below). 
In  particular,  PROG  permits  the  use  of  GO  and  RETURN  in  the  usual  manner.  In 
this  manner  all  the  traditional  imperative  constructs  are  expressed  in  an 
applicative  manner.  (Note  ASET'  Is  Imperative) 

None  of  this  is  particularly  new;  theoreticians  have  modelled  imperative 
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constructs  in  these  terms  for  years.  What  is  new,  we  think,  is  the  serious 
proposal  that  a practical  interpreter  and  compiler  can  be  designed  for  a language 
in  which  such  models  serve  as  the  sole  definitions  of  these  imperative 
constructs.  {Note  Dijkstra's  Opinion}  This  approach  has  both  advantages  and 
disadvantages . 

One  advantage  is  that  the  base  language  is  small.  A simple-minded 
interpreter  or  compiler  can  be  written  in  a few  hours.  (We  have  re-implemented 
the  SCHEME  interpreter  from  scratch  a dozen  times  or  more  to  test  various 
representation  strategies;  this  was  practical  only  because  of  the  small  size  of 
the  language.  Similarly,  the  CHEAPV  compiler  is  fewer  than  ten  pages  of  code, 
and  could  be  rewritten  in  a day  or  less.)  Once  the  basic  interpreter  is  written, 
the  macro  definitions  for  all  the  complex  constructs  can  be  used  without 
revision.  Moreover,  the  same  macro  definitions  can  be  used  by  both  interpreter 
and  compiler  (or  by  several  versions  of  interpreter  and  compiler!).  Excepting 
the  very  few  primitives  such  as  LAMBDA  and  IF,  it  is  not  necessary  to  "implement 
a construct  twice",  once  each  in  interpreter  and  compiler. 

Another  advantage  is  that  new  macros  are  very  easy  to  write  (using 
facilities  provided  in  SCHEME).  One  can  easily  invent  a new  kind  of  DO  loop,  for 
example,  and  implement  it  in  SCHEME  for  both  interpreter  and  all  compilers  in 
less  than  five  minutes.  (In  practice  such  new  control  constructs,  such  as  the 
ITERATE  loop  described  in  [Revised  Report],  are  indeed  installed  within  five  to 
ten  minutes  of  conception,  in  a routine  manner.) 

A third  advantage  is  that  the  attention  of  the  compiler  can  be  focused  on 
the  basic  constructs.  Rather  than  having  specialized  code  for  two  dozen 
different  constructs,  the  compiler  can  have  much  deeper  knowledge  about  each  of  a 
few  basic  constructs.  One  might  object  that  this  "deeper  knowledge"  consists  of 
recognizing  the  two  dozen  special  cases  represented  by  the  separate  constructs  of 
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the  Cornier  case.  This  is  true  to  some  extent.  It  is  also  true,  however,  that  in 
the  latter  case  such  deep  knowledge  will  carry  over  to  any  new  constructs  which 
are  invented  and  represented  as  macros. 

Among  the  disadvantages  of  the  macro  approach  are  lack  of  speed  and  the 
discarding  of  information.  Many  people  have  objected  that  macros  are  of 
necessity  slower  than,  say,  the  FSUBR  implementation  used  by  most  LISP  systems. 
This  is  true  in  many  current  interpretive  implementations,  but  need  not  be  true 
of  compilers  or  more  cleverly  designed  interpreters.  Moreover,  the  FSUBR 
implementation  is  not  general;  it  is  very  hard  for  a user  to  write  a meaningful 
FSUBR  and  then  describe  to  the  compiler  the  best  way  to  compile  it.  The  macro 
approach  handles  this  difficulty  automatically.  We  do  not  object  to  the  use  of 
the  FSUBR  mechanism  as  a special-case  "speed  hack”  to  improve  the  performance  of 
an  interpreter,  but  we  insist  on  recognizing  the  fact  that  it  is  not  as  generally 
useful  as  the  macro  approach. 

Another  objection  relating  to  speed  is  that  the  macros  produce  convoluted 
code  involving  the  temporary  creation  and  subsequent  invocation  of  many  closures. 
We  feel,  first  of  all,  that  the  macro  writer  should  concern  himself  more  with 
producing  correct  code  than  fast  code.  Furthermore,  convolutedness  can  be 
eliminated  by  a few  simple  optimization  techniques  in  the  compiler,  to  be 
discussed  below.  Finally,  function  calls  need  not  be  as  expensive  as  is 

i 

popularly  supposed.  [Steele] 

Information  is  discarded  by  macros  in  the  situation,  for  example,  where  a 
DO  macro  expands  into  a large  mess  that  is  not  obviously  a simple  loop;  later 
compiler  analysis  must  recover  this  information.  This  is  indeed  a problem.  We 
feel  that  the  compiler  is  probably  better  off  having  to  recover  the  information 
anyway,  since  a deep  analysis  allows  it  to  catch  other  loops  which  the  user  did 
not  use  DO  to  express  for  one  reason  or  another.  Another  is  the  possibility  that 
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DO  could  leave  clues  around  in  the  form  of  declarations  if  desired. 

Another  difficulty  with  the  discarding  of  information  is  the  issuing  of 
meaningful  diagnostic  messages.  The  user  would  prefer  to  see  diagnostics  mention 
the  originally-written  source  constructs,  rather  than  the  constructs  into  which 
the  macros  expanded.  (An  example  of  this  problem  from  another  LISP  compiler  is 
that  it  may  convert  (MEMQ  X '(A  B C))  into  (OR  (EQ  X 'A)  (EQ  X 'B)  (EQ  X ’C)); 
when  by  the  same  rule  it  converts  (MEMQ  X '(A))  (a  piece  of  code  generated  by  a 
macro)  into  (OR  (EQ  X 'A)),  it  later  issues  a warning  that  an  OR  had  only  one 
subform.)  This  problem  can  be  partially  circumvented  if  the  responsibility  for 
syntax-checking  is  placed  on  the  macro  definition  at  each  level  of  expansion. 
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7.  The  Imperative  Treatment  of  Applicative  Constructs 

Given  the  characteristics  of  lexical  scoping  and  tail-recursive 
invocations,  it  is  possible  to  assign  a peculiarly  imperative  interpretation  to 
the  applicative  constructs  of  SCHEME,  which  consists  primarily  of  treating  a 
function  call  as  a GOTO.  More  generally,  a function  call  is  a GOTO  that  can  pass 
one  or  more  items  to  its  target;  the  special  case  of  passing  no  arguments  is 
precisely  a GOTO.  It  is  never  necessary  for  a function  call  to  save  a return 
address  of  any  kind.  It  is  true  that  return  addresses  are  generated,  but  we 
adopt  one  of  two  other  points  of  view,  depending  on  context.  One  is  that  the 
return  address,  plus  any  other  data  needed  to  carry  on  the  computation  after  the 
called  function  has  returned  (such  as  previously  computed  intermediate  values  and 
other  return  addresses)  are  considered  to  be  packaged  up  into  an  additional 
argument  (the  continuation)  which  is  passed  to  the  target.  This  lends  itself  to 
a non-functional  interpretation  of  LAMBDA,  and  a method  of  expressing  programs 
called  the  continuation-passing  style  (similar  to  the  message-passing  actors 
paradigm  [Hewitt]),  to  be  discussed  further  below.  The  other  view,  more 
intuitive  in  terms  of  the  traditional  stack  implementation,  is  that  the  return 
address  should  be  pushed  before  evaluating  arguments  rather  than  before  calling  a 
function.  This  view  leads  to  a more  uniform  function-calling  discipline,  and  is 
discussed  in  [Declarative]  and  [Steele]. 

We  are  led  by  these  points  of  view  to  consider  a compilation  strategy  in 
which  function  calling  is  to  be  considered  very  cheap  (unlike  the  situation  with 
PL/ I and  ALGOL,  where  programmers  avoid  procedure  calls  like  the  plague  --  see 
[Steele]  for  a discussion  of  this).  In  this  light  the  code  produced  by  the 
sample  macros  above  does  not  seem  inefficient,  or  even  particular  ly  convoluted. 
Consider  the  expansion  of  (OR  a b c): 
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((LAMBDA  (V  R)  ( IF  V V (R) ) ) 
a 

(LAMBDA  ()  ((LAMBDA  (V  R)  (IF  V V (R) ) ) 
b 

(LAMBDA  ()  ((LAMBDA  (V  R)  (IF  V V (R))) 
c 

(LAMBDA  ()  ’NIL)))))) 


Then  we  might  imagine  the  following  (slightly  contrived)  compilation  scenario.  , 

First,  for  expository  purposes,  we  shall  rename  the  variables  in  order  to  be  able 
to  distinguish  them. 

1 

((LAMBDA  (VI  Rl)  (IF  VI  Vl’  (Rl))) 
a 

(LAMBDA  ()  ((LAMBDA  (VZ  RZ)  (IF  VZ  VZ  (RZ))) 

b 1 

(LAMBDA  ()  ((LAMBDA  (V3  R3)  (IF  V3  V3  ( R3 ) ) ) 

(LAMBDA  ()  ’NIL))))))  j 


We  shall  assign  a generated  name  to  each  LAMBDA-expression,  which  we  shall  notate 
by  writing  the  name  after  the  word  LAMBDA.  These  names  will  be  used  as  tags  in 
the  output  code. 


((LAMBDA  namel  (VI  Rl)  (IF  VI  VI  (Rl))) 
a 

(LAMBDA  nameZ  ()  ((LAMBDA  name3  (VZ  RZ)  (IF  VZ  VZ  (RZ))) 
b 

(LAMBDA  name4  ()  ((LAMBDA  name5  (V3  R3) 

(IF  V3  V3  (R3) ) ) 
c 

(LAMBDA  name6  ()  'NIL)))))) 


Next,  a simple  analysis  shows  that  the  variables  Rl,  RZ,  and  R3  always  denote  the 
LAMBDA-expressions  named  nameZ,  name4,  and  name6,  respectively.  Now  an  optimizer 
might  simply  have  substituted  these  values  into  the  bodies  of  namel,  name3,  and 
namcS  using  the  rule  of  beta-conversion,  but  we  shall  not  apply  that  technique 
here.  Instead  we  shall  compile  the  six  functions  in  a straightforward  manner. 


We  make  use  of  the  additional  fact  that  all  six  functions  are  closed  in  identical 
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environments  (we  count  two  environments  as  identical  if  they  involve  the  same 
variable  bindings,  regardless  of  the  number  of  "frames"  involved;  that  is,  the 
< environment  is  the  same  inside  and  outside  a (LAMBDA  ()  ...)).  Assume  a simple 

target  machine  with  argument  registers  called  regl,  regZ,  etc. 


main : 

<code  for  a> 

;result  in  regl 

LOAD  regZ,[name2] 

;[name2]  is  the  closure  for  nameZ 

CALL-FUNCTION  2, [namel] 

;call  namel  with  2 arguments 

namel : 

JUMP-IF-NIL  regl .naraela 

RETURN 

; return  the  value  in  regl 

name  la: 

CALL-FUNCTION  0,reg2 

;call  function  in  regZ,  0 arguments 

name2 : 

<code  for  b> 

;result  in  regl 

LOAD  reg2,[name4] 

;[name4]  is  the  closure  for  name4 

CALL-FUNCTION  2,[name3] 

;call  name 3 with  2 arguments 

name3 : 

JUMP-IF-NIL  regl,narao3a 

RETURN 

; return  the  value  in  regl 

name3a : 

CALL-FUNCTION  O.regZ 

;call  function  in  regZ,  0 arguments 

name4 : 

<code  for  c> 

;result  in  regl 

LOAD  reg2,[name6] 

;[name6]  is  the  closure  for  name6 

CALL-FUNCTION  2,[name5] 

;call  name5  with  2 arguments 

name5: 

JUMP-IF-NIL  regl,name5a 

RETURN 

; return  the  value  in  regl 

name5a : 

CALL-FUNCTION  0,reg2 

;call  function  in  reg2,  0 arguments 

name6 : 

LOAD  regl, 'NIL 

RETURN 

;constant  NIL  in  regl 

Now  we  make  use  of  our  knowledge  that  certain  variables  always  denote  certain 
functions,  and  convert  CALL-FUNCTION  of  a known  function  to  a simple  GOTO.  (We 
have  actually  done  things  backwards  here;  in  practice  this  knowledge  is  used 
before  generating  any  code.  We  have  fudged  over  this  issue  here,  but  will  return 
to  it  later.  Our  purpose  here  is  merely  to  demonstrate  the  treatment  of  function 
calls  as  GOTOs . ) 


main:  <code  for  a>  ;result  in  regl 

LOAD  reg2,[name2]  ;[name2]  is  the  closure  for  nameZ 

GOTO  namel 
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namel : 

JUMP-IF-NIL  regl.namela 
RETURN 

; return  the  value  in  regl 

name  la: 

GOTO  nameZ 

nameZ: 

<code  for  b> 

;result  in  regl 

LOAD  regZ,[name4] 

GOTO  name3 

;[name4]  is  the  closure  for 

name4 

name3: 

JUMP-IF-NIL  regl,name3a 
RETURN 

;return  the  value  in  regl 

name3a : 

GOTO  name4 

name4 : 

<code  for  c> 

jresult  in  regl 

LOAD  regZ,[name6] 

GOTO  name5 

;[name6]  is  the  closure  for 

name  6 

name5: 

JUMP-IF-NIL  regl ,name5a 
RETURN 

;return  the  value  in  regl 

name5a: 

GOTO  name6 

name6 : 

LOAD  regl, 'NIL 

RETURN 

;constant  NIL  in  regl 

The  construction  [foo]  indicates  the  creation  of  a closure  for  foo  in  the  current 
environment.  This  will  actually  require  additional  instructions,  but  we  shall 
ignore  the  mechanics  of  this  for  now  since  analysis  will  remove  the  need  for  the 
construction  in  this  case.  The  fact  that  the  only  references  to  the  variables 
Rl,  RZ,  and  R3  are  function  calls  can  be  detected  and  the  unnecessary  LOAD 
instructions  eliminated.  (Once  again,  this  would  actually  be  determined  ahead  of 
time,  and  no  LOAD  instructions  would  be  generated  in  the  first  place.  All  of 
this  is  determined  by  a general  pre-analysis,  rather  than  a peephole  post-pass.) 
Moreover,  a GOTO  to  a tag  which  immediately  follows  the  GOTO  can  be  eliminated. 


T 
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main : 

<code  for  a> 

;result 

in  regl 

namel : 

JUMP- IF-NIL  regl.namela 

RETURN 

jreturn 

the  value 

in 

regl 

name  la : 

name2 : 

<code  for  b> 

; result 

in  regl 

name 3 : 

JUMP- IF-NIL  regl , name3a 

RETURN 

;return 

the  value 

in 

regl 

name3a : 

name4 : 

<code  for  c> 

jresult 

in  regl 

name5 : 

JUMP- IF-NIL  regl ,name5a 

RETURN 

; return 

the  value 

in 

regl 

name5a : 

name6 : 

LOAD  regl , 'NIL 

jconstant  NIL  in  regl 

RETURN 

This  code  is  in  fact  about  what  one  would  expect  out  of  an  ordinary  LISP 
compiler.  (There  is  admittedly  room  for  a little  more  improvement.)  RABBIT 
indeed  produces  code  of  essentially  this  form,  by  the  method  of  analysis  outlined 
here . 

Similar  considerations  hold  for  the  BLOCH  macro.  Consider  the  expression 
(BLOCK  a b c);  conceptually  this  should  perform  a,  b,  and  c sequentially.  Let 
us  examine  the  code  produced: 


( ( LAMBDA  (A  B)  (B)) 
a 

(LAMBDA  ()  ((LAMBDA  (A  B)  (B)) 
b 

(LAMBDA  ()  c)))) 


Renaming  the  variables  and  assigning  names  to  LAMBDA-expressions: 


((LAMBDA  namel  (A1  Bl)  ( B1 ) ) 
a 

(LAMBDA  nameZ  ()  ((LAMBDA  name3  (AZ  BZ)  (BZ)) 
b 

(LAMBDA  name4  ()  c)))) 


Producing  code  for  the  functions: 
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main:  <code  for  a> 

LOAD  reg2,[name2] 
CALL-FUNCTION  2,[namel] 

namel:  CALL-FUNCTION  0,reg2 

name2:  <code  for  b> 

LOAD  reg2,[name4] 
CALL-FUNCTION  2,[name3] 

name3:  CALL-FUNCTION  0,reg2 

name4:  <code  for  c> 

RETURN 


Turning  general  function  calls  into  direct  GO'S,  on  the  basis  of  analysis  of  what 
variables  must  refer  to  constant  functions: 


main:  <code  for  a> 

LOAD  reg2,[name2] 
GOTO  namel 

namel:  GOTO  name2 

name2:  <code  for  b> 

LOAD  reg2,[name4] 
GOTO  name3 

name3:  GOTO  name4 

name4:  <code  for  c> 
RETURN 


Eliminating  useless  GOTO  and  LOAD  instructions: 


main:  <code  for  a> 

namel : 

name2:  <code  for  b> 
name3 : 

name4:  <code  for  c> 
RETURN 


What  more  could  one  ask  for? 

Notice  that  this  has  fallen  out  of  a general  strategy  involving  only  an 
approach  to  compiling  function  calls,  and  has  involved  no  special  knowledge  of  OR 
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or  BLOCK  not  encoded  in  the  macro  rules.  The  cases  shown  so  far  are  actually 
special  cases  oT  a more  general  approach,  special  in  that  all  the  conceptual 
closures  involved  are  closed  in  the  same  environment,  and  called  from  places  that 
have  not  disturbed  that  environment,  but  only  used  "registers".  In  the  more 
general  case,  the  environments  of  caller  and  called  function  will  be  different. 
This  divides  into  two  subcases,  corresponding  to  whether  the  closure  was  created 
by  a simple  LAMBDA  or  by  a LABELS  construction.  The  latter  involves  circular 
references,  and  so  is  somewhat  more  complicated;  but  it  is  easy  to  show  that  in 
the  former  case  the  environment  of  the  caller  must  be  that  of  the  (known)  called 
function,  possibly  with  additional  values  added  on.  This  is  a consequence  of 
lexical  scoping.  As  a result,  the  function  call  can  be  compiled  as  a GOTO 
preceded  by  an  environment  adjustment  which  consists  merely  of  lopping  off  some 
leading  portion  of  the  current  one  (intuitively,  one  simply  "pops  the  unnecessary 
crud  off  the  stack").  LABELS-closed  functions  also  can  be  treated  in  this  way, 
if  one  closes  all  the  functions  in  the  same  way  (which  RABBIT  presently  does,  but 
this  is  not  always  desirable).  If  one  does,  then  it  is  easy  to  see  the  effect  of 
expanding  a PROG  into  a giant  LABELS  as  outlined  in  [Imperative]  and  elsewhere: 
normally,  a GOTO  to  a tag  at  the  same  level  of  PROG  will  involve  no  adjustment  of 
environment,  and  so  compile  into  a simple  GOTO  instruction,  whereas  a GOTO  to  a 
tag  at  an  outer  level  of  PROG  probably  will  involve  adjusting  the  environment 
from  that  of  the  inner  PROG  to  that  of  the  outer.  All  of  this  falls  out  of  the 
proper  imperative  treatment  of  function  calls. 


) 
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8.  Compilation  Strategy 

The  overall  approach  RABBIT  takes  to  the  compilation  of  SCHEME  code  nay 
be  summarized  as  follows: 

(1)  Alpha-conversion  (renaming  of  variables)  and  macro-expansion 
(expansion  of  syntactic  rewrite  rules). 

(2)  Preliminary  analysis  (variable  references,  "trivial"  expressions,  and 
side  effects). 

(3)  Optimization  (meta-evaluation). 

(4)  Conversion  to  continuation-passing  style. 

(5)  Environment  and  closure  analysis. 

(6)  Code  generation. 

During  (1)  a data  structure  is  built  which  is  structurally  a copy  of  the  user 
program  but  in  which  all  variables  have  been  renamed  and  in  which  at  each  "node" 
of  the  program  tree  are  additional  slots  for  extra  information.  These  slots  are 
filled  in  during  (2).  In  (3)  the  topology  of  the  structure  may  be  modified  to 
reflect  transformations  made  to  the  program;  routines  from  (2)  may  be  called  to 
update  the  information  slots.  In  (4)  a new  data  structure  is  contructed  from  the 
old  one,  radically  different  in  structure,  but  nevertheless  also  tree-like  in 
form.  During  (5)  information  is  added  to  slots  in  the  second  structure.  In  (6) 
this  information  is  used  to  produce  the  final  code. 
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Alpha-conversion  and  macro-expansion 

In  this  phase  a copy  of  the  user  program  is  made.  The  user  program  is 
conceptually  a tree  structure;  each  node  is  one  of  several  kinds  of  construct 
(constant,  variable,  LAMBDA-expression,  IF-expression,  combination,  etc.).  Some 
kinds  of  nodes  have  subnodes;  for  example,  a LAMBDA-expression  node  has  a 
subnode  representing  the  body,  and  a combination  node  has  a subnode  for  each 
argument.  The  copying  is  performed  in  the  obvious  way  by  a recursive  tree-walk. 
In  the  process  all  bound  variables  are  renamed.  Each  bound  variable  is  assigned 
a new  generated  name  at  the  point  of  binding,  and  each  node  for  a reference  to  a 
bound  variable  contains  this  generated  name,  not  the  original  name.  From  this 
point  on  all  variables  are  dealt  with  in  terms  of  their  new  names.  (This  is 
possible  because,  as  a consequence  of  lexical  scoping,  we  can  identify  all 
references  to  each  bound  variable.)  These  new  names  are  represented  as  atomic 
symbols,  and  the  property  lists  of  these  symbols  will  later  be  used  to  store 
information  about  the  variables. 

As  each  subform  of  the  user  program  is  examined,  a check  is  made  for  a 
macro  call,  which  is  a list  whose  car  is  an  atomic  symbol  with  one  of  several 
macro-defining  properties.  When  such  a call  is  encountered,  the  macro  call  is 
expanded,  and  the  tree-walk  is  resumed  on  the  code  returned  by  the  expansion 


process . 
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B.  Preliminary  analysis 

The  preliminary  analysis  ("phase  1”)  is  in  three  passes,  each  involving  a 
tree-walk  of  the  node  structure,  filling  in  information  slots  at  each  node.  (Two 
passes  would  have  sufficed,  but  for  reasons  of  clarity  and  modularity  there  is 
one  pass  for  each  type  of  analysis.) 

The  first  pass  (ENV-ANALYZE)  analyzes  variable  references.  For  each  node 
we  determine  the  set  of  all  local  (bound)  variables  referenced  at  or  below  that 
node.  For  example,  for  a variable-reference  node  this  set  is  empty  (for  a global 
variable),  or  the  singleton  set  of  the  variable  itself  (for  a local  variable); 
for  a LAMBDA-expression,  it  is  the  set  for  its  body  minus  the  variables  bound  by 
that  LAMBDA-expression;  for  an  IF-expression,  it  is  the  union  of  the  sets  for 
the  predicate,  consequent,  and  alternative;  and  so  on.  We  also  compute  for  each 
node  the  set  of  bound  variables  which  appear  in  an  ASET'  at  or  below  the  node. 
(This  set  will  be  a subset  of  the  first  set,  but  no  non-trivial  use  of  this 
property  is  used  in  this  pass.)  Finally,  for  each  variable  we  store  several 
properties  on  its  property  list,  including  a list  of  all  nodes  which  reference 
the  variable  (for  "reading")  and  a list  of  of  all  ASET*  nodes  which  modify  the 
variable.  These  lists  are  built  incrementally,  with  an  entry  added  as  each 
reference  is  encountered  during  the  tree  walk.  (This  exemplifies  the  general 
strategy  for  passing  data  around;  any  Information  which  cannot  be  passed 
conveniently  up  and  down  the  tree,  but  which  must  jump  laterally  between 
branches,  is  accumulated  on  the  property  lists  of  the  variables.  It  may  appear 
to  be  "lucky"  that  all  such  information  has  to  do  with  variables,  but  this  is 
actually  an  extremely  deep  property  of  our  notation.  The  entire  point  of  using 
identifiers  is  to  relate  textually  separated  constructions.  We  depend  on  alpha- 
conversion  to  give  all  variables  distinct  names  (by  "names"  we  really  mean 
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"compile-time  data  structures")  so  that  the  Information  for  variables  which  the 
user  happened  to  give  the  same  name  will  not  be  confused.) 

The  second  pass  (TRIV-ANALYZE)  locates  "trivial"  portions  of  the  program. 
(Cf.  [Wand  and  Friedman].)  Constants  and  variables  are  trivial;  an  IF- 
expressions  is  trivial  iff  the  predicate,  consequent,  and  alternative  are  all 
trivial;  an  ASET*  is  trivial  iff  its  body  is  trivial;  a combination  is  trivial 
iff  the  function  is  either  a global  variable  which  is  the  name  of  a MacLISP 
primitive,  or  a LAMBDA-expression  whose  body  is  trivial,  and  the  arguments  are 
all  trivial.  LAHBDA-expressions,  LABELS-forms  (which  contain  LAMBDA- 
expressions) , and  CATCH-forms  are  never  trivial.  The  idea  is  that  a trivial 
expression  is  one  that  HacLISP  could  evaluate  itself,  without  benefit  of  SCHEME 
control  structures.  (No  denigration  of  MacLISP's  ability  is  intended  by  this 
terminology! ) Note  particularly  the  two  special  cases  of  combinations 
distinguished  here  (in  which  the  function  position  contains  either  the  name  of  a 
MacLISP  primitive  or  a LAMBDA-expression) ; they  are  very  important,  and  shall  be 
referred  to  respectively  as  TRIVFN-combinations  and  LAMBDA-combinations. 

The  third  pass  (EFFS-ANALYZE)  analyzes  the  possible  side-effects  caused 
by  each  node,  and  the  side-effects  which  could  affect  it.  It  actually  produces 
two  sets  of  analyses,  one  liberal  and  one  conservative.  Where  there  is  any 
uncertainty  as  to  what  side-effects  may  be  involved,  it  assumes  none  in  one  case 
and  all  possible  in  the  other.  The  liberal  estimation  is  used  only  to  issue 
error  messages  to  the  user  about  possible  conflicts  which  might  result  as  a 
consequence  of  the  freedom  to  evaluate  arguments  to  combinations  in  any  order. 
The  user  is  given  the  benefit  of  doubt,  and  warned  only  of  a "provable"  conflict. 
(Actually,  the  "proof"  is  a little  sloppy,  and  can  err  in  both  directions,  but  in 
practice  it  has  issued  no  false  alarms  and  a number  of  helpful  warnings.)  The 
conservative  estimation  is  used  by  the  optimizer,  which  will  move  expressions 
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only  if  it  can  prove  that  there  will  be  no  conflict. 

* 

Side  effects  are  grouped  into  classes:  ASET,  RPLACA  and  RPLACD  (which 
are  considered  distinct),  FILE  (input/output  operations),  and  CONS.  These  are 
not  intended  to  be  exhaustive;  there  is  also  an  internal  notation  for  "any  side- 
effect  whatever".  The  use  of  classes  enables  the  analysis  to  realize,  for 
example,  that  RPLACA  cannot  affect  the  value  of  a variable  per  se.  There  is  a 
moderately  large  body  of  data  in  RABBIT  about  the  side-effects  of  MacLISP 
primitive  functions.  For  example,  CAR,  CDR,  CAAR,  CADR,  and  so  on  are  known  not 
to  have  side-effects,  and  to  be  respectively  affected  only  by  RPLACA,  RPLACD, 
RPLACA,  RPLACA  or  RPLACD,  and  so  on.  Similarly,  RABBIT  knows  that  ASET'  affects 
the  values  of  variables,  but  cannot  affect  the  outcome  of  a CAR  operation.  (It 
may  affect  the  value  of  the  expression  (CAR  X),  but  only  because  a variable 
reference  is  a subnode  of  the  combination.  The  effects,  or  affectability,  of  a 
combination  are  the  union  of  the  effects,  or  affectibility,  of  all  arguments  plus 
those  of  the  function.)  The  CONS  side-effect  is  a special  case.  This  side- 
effect  cannot  affect  anything,  and  two  instances  of  it  may  be  performed  in  the 
"wrong"  order,  but  performing  a single  instance  twice  will  produce  distinct  (as 
determined  by  EQ)  and  therefore  incorrect  results.  In  particular,  closures  of 
LAMBDA-expressions  involve  the  CONS  side-effect.  (The  definition  of  SCHEME  says 
nothing  about  whether  EQ  is  a valid  operation  on  closures,  but  in  general  it  is 
not  a good  idea  to  produce  unnecessary  multiple  copies.)  On  the  other  hand, 
LAMBDA-expressions  occurring  in  function  position  of  a LANBDA-combination  do  not 
incur  the  CONS  side-effect.  The  CONS  side-effect  is  given  special  treatment  in 
the  optimizer.  (Note  Side-Effect  Classifications) 
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C.  Optimization 

Once  the  preliminary  analysis  is  done,  the  optimization  phase  performs 
certain  transformations  on  the  code.  The  result  is  an  equivalent  program  which 
will  (probably)  compile  into  more  efficient  code.  This  new  program  is  itself 
structurally  a valid  SCHEME  program;  that  is,  all  transformations  are  contained 
within  the  language.  The  transformations  are  thus  similar  to  those  performed  on 
macro  calls,  consisting  of  a syntactic  rewriting  of  an  expression,  except  that 
the  situations  where  such  transformations  are  applicable  are  more  easily 
recognized  in  the  case  of  macro  calls.  It  should  be  clear  that  the  optimizer  and 

the  macro-functions  are  conceptually  at  the  same  level  in  that  they  may  be 

written  in  the  same  meta-language  that  operates  on  representations  of  SCHEME 
programs.  {Note  Non-deterministic  Optimization) 

The  simplest  transformation  is  that  a combination  whose  function  position 
contains  a LAMBDA-expression  and  which  has  no  arguments  can  be  replaced  by  the 
body  of  the  LAMBDA-expression: 

((LAMBDA  ()  body))  = > body 

Another  is  that,  in  the  case  of  a LAMBDA- combination,  if  some  parameter  of  the 

LAMBDA-expression  is  not  referenced  and  the  corresponding  argument  can  be  proved 

to  have  no  side-effects  (with  an  exception  discussed  below),  then  the  parameter 
and  argument  can  be  eliminated: 

((LAMBDA  (xl  x2  x3)  body)  al  aZ  a3) 

=>  ((LAMBDA  (xl  x3)  body)  al  a3) 

if  x2  is  unreferenced  in  body  and  a2  has  no  side-effects 

Repeated  applications  of  this  rule  can  lead  to  the  preceding  case. 

A third  rule  is  that,  in  a LAMBDA- combination,  an  argument  can  be 
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substituted  for  one  or  more  occurrences  of  a parameter  in  the  body  of  the  LAMBDA- 
expression.  (This  rule  is  related  to  the  view  of  LAMBDA  as  a renaming  operator 
discussed  in  [Declarative],  and  together  with  the  two  preceding  rules  make  up  the 
rule  of  beta-conversion.)  Such  a substitution  is  permissible  only  if  (a)  either 
the  parameter  is  referred  to  only  once  or  the  argument  has  no  side  effects,  and 
(b)  the  substitution  will  not  alter  the  order  in  which  expressions  are  evaluated 
in  such  a way  as  to  allow  possible  side-effects  to  produce  different  results. 

Before  performing  the  substitution  it  is  necessary  to  show  that  side-effects  will 

\ 

not  interfere  in  this  manner.  This  issue  is  discussed  In  [Allen  and  Cocke], 

[Geschke],  and  [Wulf],  and  characterized  more  accurately  in  [Standish].  There  is 
also  some  difficulty  if  the  parameter  appears  in  an  ASET'.  Presently  RABBIT  does 
not  attempt  any  form  of  substitution  for  such  a parameter.  (ASET'  is  so  seldom 
used  in  SCHEME  programs  that  this  restriction  makes  very  little  difference.) 

This  third  rule  creates  an  exception  to  the  second.  If  an  argument  with 
a side  effect  is  referred  to  once,  and  is  substituted  for  the  reference,  then  the 
second  rule  must  be  invoked  to  eliminate  the  original  occurrence  of  the  argument, 
so  that  the  side  effect  will  not  occur  twice.  This  requires  a little  collusion 
between  the  two  rules. 

Even  if  such  a substitution  is  permissible,  it  is  not  always  desirable; 
time/space  tradeoffs  are  involved.  The  current  heuristic  is  that  a substitution 
is  desirable  if  (1)  the  parameter  is  referred  to  only  once;  or  (Z)  the  argument 
to  be  substituted  in  is  a constant  or  variable;  or  (3)  the  argument  is  a LAMBDA- 
expression  whose  body  is  (3a)  a constant,  or  (3b)  a variable  reference,  or  (3c)  a 
combination  which  has  no  more  arguments  than  the  LAMBDA-expression  requires  and  < 

for  which  the  arguments  are  all  constants  or  variables.  This  heuristic  was 
designed  to  be  as  conservative  as  possible  while  handling  most  cases  which  arise 
from  typical  macro-expansions. 
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The  case  where  the  expression  substituted  for  a variable  is  a LAMBDA- 
expression  constitutes  an  instance  of  procedure  integration  [Allen  and  Cocke]. 
The  more  general  kind  of  procedure  integration  proposed  in  [Declarative],  which 
would  involve  block  compilation  of  several  user  functions,  and  possibly  also  user 
declarations  or  data  type  analysis,  has  not  been  implemented  yet. 

It  would  be  possible  to  substitute  a LAMBDA-expression  for  a variable 
reference  in  the  case  of  a variable  bound  by  a LABELS.  This  might  be  useful  in 
the  case  of  a LABELS  produced  by  a simple-minded  PROG  macro,  which  produced  a 
labelled  function  for  each  statement  of  the  PROG;  in  such  a case  most  labelled 
functions  would  be  referred  to  only  once.  We  have  not  implemented  this  yet  in 
RABBIT.  {Note  Loop  Unrolling) 

Currently  there  is  not  any  attempt  to  perform  the  inverse  of  beta- 
conversion.  This  process  would  be  that  of  locating  common  subexpressions  of  some 
single  large  expression,  making  that  large  expression  the  body  of  a LAMBDA- 
expression  of  one  parameter,  replacing  all  occurrences  of  the  common 
subexpression  by  a reference  to  the  parameter,  and  replacing  the  large  expression 
by  a combination  whose  function  position  contained  the  LAMBDA-expression  and 
whose  argument  was  a copy  of  the  common  subexpression.  More  generally,  several 
common  subexpressions  could  be  isolated  at  once  and  made  into  several  parameters 
of  the  LAMBDA-expression.  For  example,  consider: 

(LAMBDA  (ABC) 

(LIST  (/  (♦  (-  B)  (SQRT  (-  (*  B 2)  (*  4 A C)))) 

(*  2 A)) 

(/  (-  (-  B)  (SQRT  (-  (~  B 2)  <*  4 A C)))) 

<*  2 A)))) 

Within  the  large  expression  (LIST  ...)  we  might  detect  the  common  subexpressions 
(-  B),  (SQRT  ...),  and  (*  Z A).  Thus  we  would  invent  three  parameters  Ql,  Q2,  Q3 


and  transform  the  expression  into: 
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(LAMBDA  (A  B C) 

((LAMBDA  (Ql  Q2  Q3) 

(LIST  (/  (♦  Q1  Q2)  Q3) 

(/  (-  Q1  QZ)  Q3) ) ) 

(-  B) 

(SORT  (-  (*  B 2)  (*  4 A C))) 

(*  2 A))) 


(There  would  be  no  problem  of  conflicting  names  as  there  is  for  macro  rules, 
because  we  are  operating  on  code  for  which  all  variables  have  already  been 
renamed;  Ql,  QZ,  and  Q3  can  be  chosen  as  the  next  variables  in  the  renaming 
sequence . ) 

This  approach  doesn't  always  work  if  side-effects  are  present;  the 
abstracted  (!)  common  subexpression  may  be  evaluated  too  soon,  or  the  wrong 
number  of  times.  This  can  be  solved  by  wrapping  (LAMBDA  ()  •)  around  the  common 
subexpression,  and  replacing  references  by  a combination  instead  of  a simple 
variable  reference.  For  example: 


(IF  (HAIRYP  X) 

(BLOCK  (PRINT  '|Here  is  some  hair: | ) 
(PRINT  X) 

(PRINT  ' |End  of  hair.l )) 
(BLOCK  (PRINT  '|This  one  is  bald:|) 
(PRINT  X) 

(PRINT  'I End  of  baldness.!))) 


We  could  not  transform  it  into  this: 


((LAMBDA  (Ql) 

(IF  (HAIRYP  X) 

(BLOCK  (PRINT  '(Here  is  some  hair:|) 
Ql 

(PRINT  1 | End  of  hair.l)) 

(BLOCK  (PRINT  '|This  one  is  bald:|) 

Ql 

(PRINT  '| End  of  baldness.!)))) 

(PRINT  X)) 


because  x would  be  printed  before  the  appropriate  leading  message.  Instead,  we 
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f transform  it  into: 


((LAMBDA  (Ql) 

(IF  (HAIRYP  X) 

( BLOCK  (PRINT  '|Here  is  som  hair:|) 
(Ql) 

(PRINT  '| End  of  hair.|)) 

(BLOCK  (PRINT  '|This  one  is  bald:|) 
(Ql) 

. (PRINT  *| End  of  baldness. | )))) 
(LAMBDA  ()  (PRINT  X))) 


This  is  similar  to  the  call-by-name  trick  used  in  writing  macro  rules. 

A more  general  transformation  would  detect  nearly  common  subexpressions 
as  follows: 


((LAMBDA  (Ql) 

(IF  (HAIRYP  X) 

(Ql  '| Here  is  some  hair:| 

1 | End  of  hair. | ) 

(Ql  '| This  one  is  bald:| 

'| End  of  baldness. | ))) 

(LAMBDA  (Q2  Q3) 

(BLOCK  (PRINT  Q2)  (PRINT  X)  (PRINT  Q3)))) 


In  this  way  we  can  express  the  notion  of  subroutlnization . 

(Note  Subroutinization) 

We  point  out  these  possibilities  despite  the  fact  that  they  have  not  been 
implemented  in  RABBIT  because  the  problem  of  isolating  common  subexpressions 
seems  not  to  have  been  expressed  in  quite  this  way  in  the  literature  on 
compilation  strategies.  We  might  speculate  that  this  is  because  most  compilers 
which  use  complex  optimization  strategies  have  been  for  ALGOL-like  languages 
which  do  not  treat  functions  as  full-fledged  data  objects,  or  even  permit  the 
writing  of  "anonymous”  functions  in  functions  calls  as  LISP  does. 

RABBIT  does  perform  folding  on  constant  expressions  [Allen  and  Cocke]; 
that  is,  any  combination  whose  function  is  a side-effect-less  MacLISP  primitive 
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and  whose  arguments  are  all  constants  is  replaced  by  the  result  of  applying  the 
primitive  to  the  arguments.  There  is  presently  no  attempt  to  do  the  same  thing 
for  side-effect-less  SCHEME  functions,  although  this  is  conceptually  no  problem. 

Finally,  there  are  two  transformations  on  IF  expressions.  One  is  simply 
that  an  IF  expression  with  a constant  predicate  is  simplified  to  its  consequent 
or  alternative  (resulting  in  elimination  of  dead  code).  The  other  was  adapted 
from  [Standish],  which  does  not  have  this  precise  transformation  listed,  but 
gives  a more  general  rule.  In  its  original  form  this  transformation  is: 

(IF  (IF  a b c)  d e)  =>  (IF  a (IF  b d e)  (IF  c d e)) 

One  problem  with  this  is  that  the  code  for  d and  e is  duplicated.  This  can  be 
avoided  by  the  use  of  LAMBDA-expressions: 

((LAMBDA  (Q1  Q2) 

(IF  a 

(IF  b (Ql)  (QZ)) 

(IF  c (Ql)  (QZ)))) 

(LAMBDA  ()  d) 

(LAMBDA  ( ) e)) 

As  before,  there  is  no  problem  of  name  conflicts  with  Ql  and  QZ.  While  this  code 
may  appear  unnecessarily  complex,  the  calls  to  the  functions  Ql  and  QZ  will 
typically,  as  shown  above,  be  compiled  as  simple  GOTO's.  As  an  example,  consider 
the  expression: 

(IF  (AND  PRED1  PRED2)  (PRINT  'WIN)  (ERROR  ’LOSE)) 

Expansion  of  the  AND  macro  will  result  in: 

(IF  ((LAMBDA  (V  R)  (IF  V (R)  'NIL)) 

PRED1 

(LAMBDA  ()  PREDZ) ) 

(PRINT  'WIN) 

(ERROR  'LOSE)) 
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(For  expository  clarity  we  will  not  bother  to  rename  all  the  variables,  inasmuch 
as  they  are  already  distinct.)  Because  V and  R have  only  one  reference  apiece 
(and  there  are  no  possible  interfering  side-effects),  the  corresponding  arguments 
can  be  substituted  for  them. 


(IF  ((LAMBDA  (V  R)  (IF  PRED1  ((LAMBDA  ()  PRED2) ) 'NIL)) 
PRED1 

(LAMBDA  ()  PRED2) ) 

(PRINT  ’WIN) 

(ERROR  'LOSE)) 


Now,  since  V and  R have  no  referents  at  all,  they  and  the  corresponding  arguments 
can  be  eliminated,  since  the  arguments  have  no  side-effects. 


(IF  ((LAMBDA  ()  (IF  PRED1  ((LAMBDA  ()  PRED2 ) ) 'NIL))) 
(PRINT  'WIN) 

(ERROR  'LOSE)) 


Next,  the  combination  ((LAMBDA  ()  ...))  is  eliminated  in  two  places: 


(IF  (IF  PRED1  PRED2  'NIL) 
(PRINT  'WIN) 

(ERROR  'LOSE)) 


Now,  the  transformation  on  the  nested  IF's: 


((LAMBDA  (Q1  Q2) 

(IF  PRED1 

(IF  PRED2  (Ql)  (Q2) ) 
(IF  'NIL  (Ql)  (Q2)))) 
(LAMBDA  ()  (PRINT  'WIN)) 

(LAMBDA  ()  (ERROR  'LOSE))) 


Now  one  IF  has  a constant  predicate  and  can  be  simplified: 
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((LAMBDA  ( Q 1 QZ) 

(IF  PRED1 

(IF  PREDZ  (Ql)  (QZ)) 

<Q2))) 

(LAMBDA  ()  (PRINT  ’WIN)) 

(LAMBDA  ()  ( ERROR  ’LOSE))) 


The  variable  Ql  has  only  one  referent,  and  so  we  substitute  in,  eliainate  the 
variable  and  argument,  and  collapse  a ((LAMBDA  ()  . .)): 


((LAMBDA  (QZ) 

(IF  PRED1 

(IF  PREDZ  (PRINT  ’WIN)  (QZ)) 

(QZ))) 

(LAMBDA  ()  (ERROR  ’LOSE))) 


Recalling  that  (QZ)  is,  in  effect,  a GOTO  branching  to  the  common  piece  of  code, 
and  that  by  virtue  of  later  analysis  no  actual  closure  will  be  created  for  either 
LAMBDA -expression,  this  result  is  quite  reasonable.  {Note  Evaluation  for 
Control } 


D.  Conversion  to  Continuation-Passing  Style 

This  phase  is  the  real  meat  of  the  compilation  process.  It  is  of 
interest  primarily  in  that  it  transforms  a program  written  in  SCHEME  into  an 
equivalent  program  (the  continuation-passinq-style  version,  or  CPS  version), 
written  in  a language  isomorphic  to  a subset  of  SCHEME  with  the  property  that 
interpreting  it  requires  no  control  stack  or  other  unbounded  temporary  storage 
and  no  decisions  as  to  the  order  of  evaluation  of  (non-trlvial)  subexpressions. 
The  importance  of  these  properties  cannot  be  overemphasized.  The  fact  that  it  is 
essentially  a subset  of  SCHEME  implies  that  its  semantics  are  as  clean,  elegant, 
* and  well-understood  as  those  of  the  original  language.  It  is  easy  to  build  an 
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interpreter  for  this  subset,  given  the  existence  of  a SCHEME  interpreter,  which 
can  execute  the  transformed  program  directly  at  this  level.  This  cannot  be  said 
for  other  traditional  intermediate  compilation  forms;  building  an  interpreter 
for  triples  [Gries],  for  example,  would  be  a tremendous  undertaking.  The 
continuation-passing  version  expresses  all  temporary  intermediate  results 
explicitly  as  variables  appearing  in  the  program  text,  and  all  temporary  control 
structure  in  the  form  of  LAMBDA-expressions  (that  is,  closures).  It  is  explicit 
in  directing  the  order  of  operations;  there  is  no  non-trivial  freedom  at  any 
point  in  the  evaluation  process. 

As  a result,  once  the  CPS  version  of  a program  has  been  generated,  the 
remainder  of  the  compilation  process  is  fairly  easy.  There  is  a reasonably 
direct  correspondence  between  constructs  in  the  CPS  language  and  "machine- 
language"  operations  (if  one  assumes  CONS  to  be  a "machine-language"  primitive 
for  augmenting  environment  structure,  which  we  do).  The  later  passes  are 
complicated  only  by  the  desire  to  handle  certain  special  cases  in  an  optimal 
manner,  most  particularly  the  case  of  a function  call  whose  function  position 
contains  a variable  which  can  be  determined  to  refer  to  a known  LAMBDA- 
expression.  This  analysis  must  be  done  after  the  CPS  conversion  because  it 
applies  to  continuations  as  well  as  LAMBDA-expressions  written  by  the  user  or 
generated  by  macros. 

The  CPS  language  differs  from  SCHEME  in  only  two  respects.  First,  each 
primitive  function  is  different,  in  that  it  returns  no  value;  Instead,  it 
accepts  an  additional  argument,  the  continuation,  which  must  be  a "function"  of 
one  argument,  and  by  definition  invokes  the  continuation  tail-recursively,  giving 
it  as  an  argument  the  computed  "value"  of  the  primitive  function.  We  extend  this 
by  convention  to  non-primitive  functions,  and  so  aU  functions  are  considered  to 
take  a continuation  as  one  of  its  arguments  (by  convention  the  first  — this 
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differs  from  the  convention  used  in  the  examples  in  [SCHEME],  [Imperative],  and 
[Declarative]).  Continuations,  however,  do  not  themselves  take  continuations  as 
arguments . 

Second,  no  combination  may  have  a non-trivial  argument.  In  strict 
continuation-passing  style  (as  described  in  note  (Evalorder)  of  [Imperative]), 
this  implies  that  no  combination  can  have  another  combination  as  an  argument,  or 
an  IF-expression  with  a non-trivial  consequent  or  alternative,  etc.  We  relax 
this  to  allow  as  arguments  any  trivial  form  in  the  sense  described  above  for  the 
preliminary  triviality  analysis.  We  note  that,  in  principle,  trivial  expressions 
require  no  unbounded  space  on  the  part  of  the  SCHEME  interpreter  to  evaluate,  and 
that  the  compiler  need  not  worry  about  control  and  environment  issues  for  trivial 
expressions.  (Trivial  expressions  do  require  unbounded  space  on  the  part  of  the 
MacLISP  run-time  system,  because  the  point  of  the  triviality  analysis  is  that 
trivial  expressions  can  be  handled  by  MacLISP!  The  question  of  what  should  be 
considered  trivial  is  actually  a function  of  the  characteristics  of  our  target 
machine.  We  note  that,  at  the  least,  variables,  constants,  and  LAMBDA- 
expressions  should  be  considered  trivial.  That  the  preliminary  triviality 
analysis  does  not  consider  LAMBDA-expressions  trivial  is  a trick  so  that  all 
closures  will  be  processed  by  the  CPS-conversion  process,  and  the  fact  that  we 
call  it  a triviality  analysis  is  a white  lie.  See,  however, 
[Wand  and  Friedman].) 

The  effect  of  the  restriction  on  combinations  is  startling.  On  the  one 
hand,  they  do  not  so  constrain  the  language  as  to  be  useless;  on  the  other  hand, 
they  require  a radically  different  approach  to  the  expression  of  algorithms.  It 
is  easy  to  see  that  no  control  stack  is  necessary  to  evaluate  such  code,  for,  as 
mentioned  in  [SCHEME],  control  stack  is  used  only  to  keep  track  of  intermediate 
values  and  return  addresses,  and  these  arise  only  in  the  case  of  combinations 
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with  non-trivial  arguments,  and  conditionals  with  non-trivial  predicates. 

An  algorithm  for  converting  SCHEME  programs  to  continuation-passing  style 
was  given  in  Appendix  A of  [Declarative].  (Note  Old  CPS  Algorithm)  The  one  used 
in  RABBIT  is  almost  identical,  except  that  for  the  convenience  of  the  code- 
generation phase  a distinction  is  made  between  ordinary  LAMBDA-expressions  and 
continuations,  and  between  combinations  used  to  invoke  "functions”  and  those  used 
to  invoke  continuations.  These  sets  can  in  fact  be  consistently  distinguished, 
and  it  affords  a certain  amount  of  error-checking;  for  example,  a LAMBDA- 
expression  should  never  appear  in  the  "function"  position  of  a continuation- 
invoking combination.  Another  fine  point  is  that  ASET'  can  never  be  applied  to  a 
variable  bound  by  a continuation.  Except  for  such  differences  arising  from  their 
uses,  the  two  sets  of  constructs  are  treated  more  or  less  identically  in  later 
phases.  An  additional  difference  between  the  algorithm  in  [Declarative]  and  the 
one  in  RABBIT  is  that  trivial  subforms  are  treated  as  single  nodes  in  the  CPS 
version;  these  nodes  have  pointers  to  the  non-CPS  versions  of  the  relevant  code, 
which  are  largely  ignored  by  later  processing  until  the  final  code  is  to  be 
generated. 

It  must  be  emphasized  that  there  is  not  necessarily  a unique  CPS  version 
for  a given  SCHEME  program;  there  is  as  much  freedom  as  there  is  in  the  original 
program  to  re-order  the  evaluation  of  subexpressions.  In  the  course  of  the 
conversion  process  decisions  must  be  made  as  to  what  order  to  use  in  evaluating 
arguments  to  combinations.  The  current  decision  procedure  is  fairly  simple- 
minded,  consisting  mostly  of  not  making  copies  of  constants  and  the  values  of 
variables.  The  point  here,  as  earlier,  is  not  so  much  that  RABBIT  has  a much 
better  algorithm  than  other  compilers  as  that  it  has  a far  cleaner  way  of 
expressing  the  result.  (For  a complex  decision  procedure  for  argument  ordering, 
see  [Wulf  ]. ) {Note  Non-deterministic  CPS  Conversion) 


Environment  and  closure  analysis 
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This  phase  consists  of  four  passes  over  the  CPS  version  of  the  program. 
As  with  the  earlier  preliminary  analysis,  each  pass  determines  one  related  set  of 
information  and  attaches  this  information  to  nodes  of  the  program  tree  and  to 
property  lists. 

The  first  pass  (CENV-ANALYZE)  analyzes  variable  references  for  the  CPS 
version  in  a manner  similar  to  that  of  the  first  pass  of  the  preliminary 
analysis.  The  results  of  this  previous  analysis  are  used  here  in  the  case  of 
trivial  expressions;  with  this  exception  the  analysis  is  redone  completely, 
because  additional  variables  are  introduced  by  the  CPS  conversion.  (None  of 
these  new  variables  can  appear  in  an  ASET1,  however,  and  so  the  analysis  of 
written  variables  need  not  be  done  over.)  In  addition,  for  each  variable 
reference  which  does  not  occur  in  the  function  position  of  a combination,  we  mark 
that  variable  with  a non-nil  VARIABLE-REFP  property,  used  later  to  determine 
whether  closures  need  to  be  created  for  known  functions. 

The  second  pass  (BIND-ANALYZE)  determines  for  each  LAHBDA-expression 
whether  a closure  will  be  needed  for  it  at  run  time.  There  are  three 
possibilities: 

(1)  If  the  function  denoted  by  the  LANBDA-expression  is  bound  to  some 
variable,  and  that  variable  is  referenced  other  than  in  function  position, 
then  the  closure  is  being  treated  as  data,  and  must  be  a full  (standard  CBETA 
format)  closure.  If  the  function  itself  occurs  in  non-function  position  other 
than  in  a LAMBDA-combination,  it  must  be  fully  closed. 

(2)  If  the  closure  is  bound  to  some  variable,  and  that  variable  is  referenced 
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only  in  function  position,  but  some  of  these  references  occur  within  other 
partially  or  fully  closed  functions,  then  this  function  must  be  partially 
closed.  By  this  we  mean  that  the  environment  for  the  closure  must  be  "consed 
up",  but  no  pointer  to  the  code  need  be  added  on  as  for  a full  closure.  This 
function  will  always  be  called  from  places  that  know  the  name  of  the  function 
and  so  can  just  perform  a GO  to  the  code,  but  those  such  places  which  are 
within  closures  must  have  a complete  copy  of  the  necessary  environment. 

(3)  In  other  cases  (functions  bound  to  variables  referenced  only  in  function 
position  and  never  within  a closed  function,  or  functions  occurring  in 
function  position  of  LAMBDA-combinations),  the  function  need  not  be  closed. 
This  is  because  the  environment  can  always  be  fully  recovered  from  the 
environment  at  the  point  of  call. 

In  order  to  determine  this  information,  it  is  necessary  to  determine,  for 
each  node,  the  set  of  variables  referred  to  from  within  closed  functions  at  or 
below  that  node.  Thus  this  process  and  the  process  of  determining  which 
functions  to  close  are  highly  interdependent,  and  so  must  be  accomplished  in  a 
single  pass. 

The  second  pass  also  generates  a name  for  each  LAMBDA-expression  (to  be 
used  as  tags  in  the  output  code,  as  discussed  in  the  examples  earlier),  and  for 
non-closed  functions  determines  which  variables  will  be  assigned  to  "registers" 
or  "memory  locations".  For  these  non-closed  functions  it  may  determine  that 
certain  variables  need  not  be  assigned  locations  at  all  (they  are  never 
referenced,  or  are  bound  to  other  non-closed  functions  — the  latter  circumstance 
is  important  when  a variable  is  known  to  denote  a certain  function,  but  the 
optimizer  was  too  conservative  to  perform  beta-substitution  for  fear  of 
duplicating  code  and  thus  wasting  space).  Finally,  for  each  variable  which  is 
(logically,  at  run  time  not  necessarily  actually)  bound  to  a known  function  (and 
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which  never  appears  in  an  ASET'),  a property  KNOWN-FUNCTION  is  put  on  its 
property  list  whose  value  is  the  node  of  the  CPS  version  of  that  function.  This 
property  is  used  later  in  generating  code  for  combinations  in  whose  function 
positions  such  variables  appear. 

The  third  pass  (DEPTH-ANALYZE)  examines  each  LAflBDA-expression  and 
determines  the  precise  registers  or  memory  locations  through  which  arguments  are 
to  be  passed  to  each.  Closed  functions  take  their  arguments  in  the  standard 
registers  described  earlier;  non-closed  functions  may  take  their  arguments  in 
any  desired  places.  (Partially  closed  functions  could  also,  but  there  is  little 
advantage  to  this.)  The  allocation  strategy  in  RABBIT  for  non-closed  functions 
is  presently  merely  stack-like;  the  deeper  the  nesting  of  a function,  the  higher 
in  the  ordering  of  "registers"  and  "memory  locations"  are  the  locations  assigned. 
(See  e.g.  [Johnsson]  for  a detailed  analysis  of  the  register  allocation  problem.) 

The  fourth  pass  (CLOSE-ANALYZE)  determines  the  precise  format  of  the 
environment  to  be  constructed  for  each  closure.  That  is,  while  the  third  pass 
handles  cases  for  which  stack-allocation  of  environments  will  suffice,  the  fourth 
pass  deals  with  heap-allocated  environment  structures.  Recall  that  the  format  of 
an  environment  can  be  completely  arbitrary,  since  the  only  code  which  can 
possibly  refer  to  an  environment  is  the  function  for  a closure  of  which  the 
environment  was  created.  Therefore  the  compiler  which  compiles  that  function  has 
a free  hand  in  determining  the  structure  of  the  environment.  For  the  sake  of 
simplicity,  RABBIT  chooses  to  generate  code  which  represents  environments  simply 
as  a list  of  variable  values.  Several  environment  lists  may  share  a common  tail. 
The  environment  for  a closure  need  not  contain  any  variables  not  needed  by  the 
closed  function,  but  it  may  if  this  will  allow  the  sharing  of  a single  structure 
among  several  closures.  (There  is  a problem  with  variables  modified  by  ASET' 
which  is  discussed  in  the  next  paragraph.) 
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For  each  LAMBDA-expression  which  must  be  closed,  three  sets  of  variables 
are  computed:  (1)  the  variables  which  will  already  be  in  the  "consed" 
environment  structure  at  the  time  the  closure  is  to  be  created;  (2)  additional 
variables  which  must  be  added  ("consed  on")  to  the  existing  structure  to  create 
the  closure  (because  at  that  point  they  are  spread  out  in  "registers”)  (Note 
Heap-Allocated  Contours);  (3)  variables  which  must  be  added  to  the  environment 
immediately  after  entering  the  function  because  they  must  eventually  be  added  In 
for  closures  later  and  they  are  referred  to  in  ASET'  constructs.  The  third  set 
arises  from  a requirement  that  ASET'  constructs  must  have  a consistent  effect, 
and  confusion  can  arise  if  a variable's  value  can  be  in  more  than  one  place.  If 
the  value  were  allowed  to  be  both  in  a "register"  and  in  an  environment 
structure,  or  in  several  different  environment  structures,  then  altering  the 
value  in  one  place  would  not  affect  the  other  places.  To  assure  consistency, 
this  third  set  is  computed,  and  such  variables  must  at  run  time  be  placed  in  an 
environment  structure  to  be  shared  by  all  others  which  refer  to  such  variables. 

For  every  LABELS  statement  a set  of  variables  is  computed  which  is  the 
set  of  variables  to  be  added  to  the  existing  environment  on  entry  to  the  LABELS 
body,  in  order  to  share  this  new  structure  among  all  the  closures  to  be  created 
for  the  LABELS  functions. 


64 


F.  Code  generation 

Given  the  foregoing  analysis,  the  generation  of  code  is  straightforward, 
and  largely  consists  of  using  the  information  already  gathered  to  detect  special 
cases.  The  special  cases  of  interest  relate  almost  entirely  to  function  calls 
and  closures  (indeed,  there  is  little  else  in  the  language  for  RABBIT'S 
purposes! ) . 

RABBIT  has  provision  for  "block  compiling"  a number  of  functions  into  a 
single  module.  This  permits  an  optimization  in  which  one  function  can  transfer 
control  directly  to  another  without  going  through  the  "UUO  handler”.  Even  if 
several  user  functions  are  not  compiled  into  a single  module,  this  is  still  of 
advantage,  because  a single  user  function  can  produce  a large  number  of  output 
functions,  as  a consequence  of  the  code-generation  techniques. 

A module  consists  of  a single  FlacLISP  function  whose  body  is  a single 
PROG.  This  PROG  has  no  local  variables,  but  does  have  a number  of  tags,  one  for 
each  function  in  the  module.  On  entry  to  the  module,  the  register  **ENV**  will 
contain  the  "environment"  for  the  function  to  be  executed.  As  noted  above,  the 
format  of  this  is  arbitrary.  For  functions  compiled  by  RABBIT,  this  is  a list 
whose  car  is  a tag  within  the  PROG  and  whose  cdr  is  the  "real  environment". 
(Note  Code  Pointers)  At  the  beginning  of  the  PROG  there  is  always  the  code 

(GO  ( PR0G2  NIL  (CAR  **ENV**) 

(SETQ  **ENV**  (CDR  **ENV**)))) 

the  effect  of  which  is  to  put  the  "real  environment"  in  **ENV**  and  then  perform 
a computed  GO  to  the  appropriate  tag.  (This  is  the  only  circumstance  in  which 
the  FlacLISP  PR0G2  and  computed  GO  constructs  are  used  by  RABBIT-compiled  code. 
Either  could  be  eliminated  at  the  expense  of  more  bookkeeping,  the  former  by 
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using  a temporary  intermediate  variable,  the  latter  by  using  a giant  COND  with 
non-computed  GO  statements  (which  is  effectively  how  the  MacLISP  compiler 
compiles  a computed  GO  anyway).  As  always,  such  trivial  issues  are  left  to  the 
MacLISP  compiler  when  they  do  not  bear  on  the  issues  of  interest  in  compiling 
SCHEME  code.)  For  small  functions,  often  the  "main  entry  point"  is  the  only 
closed  function,  and  it  would  be  possible  to  eliminate  the  computed  GO,  but 
RABBIT  always  outputs  one,  because  is  is  cheap  and  provides  a useful  error  check. 

Once  the  computed  GO  has  been  performed,  the  code  following  the  tag  is 
responsible  for  performing  its  bit  of  computation  and  then  exiting.  It  may  exit 
by  setting  the  **FUN**  register  to  another  function,  setting  up  appropriate 
argument  registers,  and  then  doing  (RETURN  NIL)  to  exit  the  module  and  enter  the 
UUO  handler;  or  it  may  exit  by  directly  transferring  control  to  another  function 
within  the  module  by  performing  a GO  to  the  appropriate  tag,  after  setting  up  the 
arguments  and  **ENV**.  In  the  latter  case  the  arguments  may  actually  be  passed 
through  "memory  locations"  rather  than  the  standard  "registers".  (Conceptually, 
in  this  optimized  case  the  environment  needed  for  the  function  being  called  is 
being  passed,  not  in  **ENV**,  but  spread  out  in  those  registers  and  locations 
lower  than  those  being  used  to  pass  the  arguments.) 

Starting  with  the  CPS  version  of  one  or  more  user  functions,  the 
generation  of  the  code  for  a module  proceeds  iteratively.  Code  for  each  function 
i is  generated  in  turn,  producing  one  segment  of  code  and  a tag;  this  tag  and  code 

will  become  part  of  the  body  of  the  module.  In  processing  a function,  other 
functions  may  be  encountered;  in  general,  each  such  function  is  added  to  the 
list  of  outstanding  functions  for  the  module,  and  is  replaced  by  code  to  generate 
a closure  for  that  function.  When  all  functions  have  been  processed,  the  outer 
structure  of  the  module  is  created. 


Many  situations  are  treated  specially.  For  example, 
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((LAMBDA  j 

I 

does  not  cause  the  LAMBDA-expression  to  be  added  to  the  list  of  outstanding 
functions;  rather,  a MacLISP  PROGN  is  constructed  consisting  of  the  argument 
set-up  followed  by  the  code  for  the  body  of  the  LAMBDA-expression.  A more  subtle 

case  is  J 

| 

(FOO  (LAMBDA  . ..)  ...) 

where  FOO  is  the  name  of  a MacLISP  primitive  and  the  LAMBDA-expression  is  the  \ 

continuation.  In  this  case  a PROGN  is  constructed  consisting  of  calling  the 

MacLISP  primitive  on  the  other  arguments,  putting  this  value  into  the  appropriate  < 

location,  and  then  executing  the  body  of  the  LAMBDA-expression.  (It  should  be 

noted  that  all  these  special  cases  must  be  anticipated  by  the  analysis  preceding 

the  code  generation  phase.) 

In  the  case  of  ((LAMBDA  ...)  ...),  we  must  also  handle  the  argument  set- 

up a little  carefully,  because  parameters  which  are  never  referred  to  or  which 
represent  known  non-closed  functions  need  not  actually  be  passed.  However,  the 
corresponding  argument  for  the  first  case  must  nevertheless  be  evaluated  because 
it  may  have  a side-effect.  A good  example  is  the  result  of  expanding  BLOCK 
(neglecting  the  effects  of  optimization):  there  is  a (continuation-passing 
style)  combination  of  the  form: 

((LAMBDA  (C  A B)  (B  C))  cont  x (LAMBDA  (C)  y)) 

The  argument  x need  not  be  passed,  but  presumably  has  a side  effect  and  so  must 
be  evaluated.  The  second  LAMBDA-expression  reed  not  be  closed,  and  so  requires 
neither  evaluation  nor  passing.  The  output  code  uses  a PROGN  to  evaluate  the 
arguments  which  are  potentially  for  effect.  In  this  way  the  end  result  of  a 
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BLOCK  construct  actually  turns  out  to  be  a NacLISP  PROGN.  (The  routine 
LAMBDACATE  in  the  Appendix  is  responsible  for  this  analysis.) 

(Note  Evaluation  for  Effect) 

Another  case  of  interest  is  a combination  whose  function  position 

contains  a variable  with  a KNOWN-FUNCTION  property.  The  value  of  this  property 
is  the  node  for  the  CPS  version  of  the  function,  which  provides  information  about 
pos  * code  generation  strategies.  We  can  decide  which  arguments  needn't  be 
passed  as  for  the  ( ( LAMBDA  ...)  ...)  case,  and  can  also  arrange  to  call  the 

function  with  a direct  (NacLISP)  GO  to  the  appropriate  tag  within  the  module. 
The  set-up  of  the  environment  depends  on  whether  the  function  is  non-closed  or 
partially  closed;  in  the  latter  case  the  partial  closure  is  the  environment,  and 
in  the  former  the  environment  can  be  recovered  from  the  current  one  (and  may  even 
be  the  same). 

A certain  amount  of  "peephole  optimization"  [McKeeman]  is  also  performed, 
primarily  to  make  it  easier  for  people  to  inspect  the  code  produced,  since  the 
NacLISP  compiler  will  handle  them  anyway.  Examples  of  these  are  avoiding  the 
generation  of  SETQ  of  a variable  to  the  value  of  that  same  variable;  reduction 
of  car-cdr  chains  to  single  functions,  such  as  (CAR  (CDR  (CDR  x)))  to  (CADDR  x); 
removal  of  nested  PROGN’ s such  as 

(PROGN  a (PROGN  b c)  d)  =>  (PROGN  abed) 
and  the  like;  and  simplification  of  nested  COND's,  such  as 

(COND  (a  b)  (COND  (a  b) 

(T  (COND  (c  d)  =>  (c  d) 

...)))  ...) 

One  of  the  effects  of  this  last  peephole  optimization  is  that  many  times,  when 
the  user  writes  a COND  in  a piece  of  SCHENE  code,  that  COND  is  expanded  into  IF 
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constructs,  and 
equivalent  COND! 


then  re-contracted  by  the  peephole  optimization  into  an 
(This  fact  is  of  no  practical  consequence,  but  looks  cute.) 
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9.  Example:  Compilation  of  Iterative  Factorial 

Here  we  shall  provide  a complete  example  of  the  compilation  of  a simple 
function  IFACT  (iterative  factorial),  to  show  what  quantities  are  computed  in  the 
course  of  analyzing  the  code.  We  shall  need  some  notation  for  the  data 
structures  involved.  Every  node  of  the  program  is  represented  by  a small  data 
structure  which  has  a type  and  several  named  components.  (In  the  actual 
implementation,  a node  is  represented  as  two  such  structures;  one  contains  named 
components  common  to  all  program  nodes,  and  the  other  contains  components 
specific  to  a given  node  type.  We  shall  gloss  over  this  detail  here.)  For 
example,  a LAMBDA-expression  is  represented  by  a structure  of  type  LAMBDA  with 
components  named  UVARS  (user  variable  names),  VARS  (the  alpha-converted  names), 
BODY  (the  node  representing  the  body),  ENV  (the  environment  of  the  node),  and  so 
on.  We  shall  represent  a data  structure  as  the  name  of  its  type,  with  the 
components  written  below  it  and  indented,  with  colons  after  each  component  name. 
For  example: 

LAMBDA 


UVARS : 

(A  B) 

VARS: 

(VAR-43  VAR-44) 

BODY: 

COMBINATION 

AR6S:  VARIABLE 

VAR: 

F 

VARIABLE 

VAR: 

VAR-44 

VARIABLE 

VAR: 

VAR-43 

Notice  that  the  value  of  a component  may  itself  be  a structure.  These  structures 
are  always  arranged  in  a tree,  so  no  notation  for  cycles  will  be  needed.  In  the 
case  where  a component  contains  a list  of  things,  we  will  write  the  things  as  a 
LISP  list  unless  the  things  are  structures,  in  which  case  we  will  simply  write 
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them  in  a vertical  stack,  as  shown  in  the  example  above.  To  conserve  space,  in 
any  single  diagram  we  will  show  only  the  named  components  of  interest. 
Components  may  seem  to  appear  and  then  disappear  in  the  series  of  diagrams,  but 
in  practice  they  all  exist  simultaneously. 

The  source  code  for  our  example: 


(DEFINE  IFACT 

(LAMBDA  (N) 

(LABELS  ((F  (LAMBDA  (M  A) 

(IF  (=  M 0)  A 

(F  (-  H 1)  (*  M A)))))) 

(F  N 1)))) 


The  alpha-conversion  process  copies  the  program  and  produces  a tree  of 
structures.  All  the  bound  variables  are  renamed,  and  VARIABLE  nodes  refer  to 
these  new  names.  The  GLOBALP  component  in  a VARIABLE  node  is  non-NIL  iff  the 
reference  is  to  a global  variable.  The  ENV  component  is  simply  an  a-list 
relating  the  user  names  of  variables  to  the  new  names;  this  a-list  is  computed 
during  the  conversion  as  the  new  names  are  created  at  LAMBDA,  LABELS,  and  CATCH 
nodes . 


ENV:  () 

UVARS : ( N ) 

VARS:  (VAR-1) 

BOOY : 

LABELS 

ENV: 


ENV:  ((N  VAR-1)) 

UFNVARS:  (F) 

FNVARS:  (FNVAR-2) 

FNOEFS: 

LAMBDA 

ENV:  ((F  FNVAR-2)  (N  VAR-1)) 

UVARS:  (M  A) 

VARS:  (VAR- 3 VAR-4) 

BODY:  IF 

ENV:  ((A  VAR-4)  (M  VAR-3)  (F  FNVAR-2)  (N  VAR-1)) 

PREO:  COMBINATION 

ENV:  **•  (iee  below) 

ARGS:  VARIABLE 
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BODY: 


ENV:  *** 

VAR:  ■ 

GLOBALP:  T 
VARIABLE 

EHV:  ••• 

VAR:  VAR-} 

GLOBALP:  NIL 
CONSTANT 

ENV:  ••• 

VALUE : t 

CON:  VARIABLE 

ENV:  *** 

VAR:  VAC-4 

GLOBALP:  NIL 
ALT:  COMBINATION 

ENV: 

AR6S:  VARIABLE 

ENV:  ••• 

VAR:  FNVAR-2 

GLOBALP:  NIL 
COMBINATION 

ENV:  ••• 

AR6S:  VARIABLE 

ENV: 
VAR: 
GLOBALP 
VARIABLE 
ENV: 
VAR: 
GLOBALP 
CONSTANT 
ENV: 
VALUE: 

COMBINATION 
ENV:  ••• 

ARGS:  VARIABLE 

ENV: 

VAR: 

GLOBALP: 

VARIABLE 

ENV: 

VAR: 

6LOBALP: 

VARIABLE 

ENV: 

VAR: 

GLOBALP: 

COMBINATION 

ENV:  ( ( F FNVAR-2)  (N  VAR-1)) 

ARGS:  VARIABLE 

ENV:  ((F  FNVAR-2)  (N  VAR-1)) 

VAR:  FNVAR-2 

GLOBALP:  NIL 


•»* 

: T 

Mt 
VAR-} 
: NIL 

••• 

1 


•»» 

: T 

••• 
VAR-} 
: NIL 

VAR-4 

NIL 


?«V 


t-J  f » VM-1' 


6<e**.e  *«• 

COM***' 

twv  H*  • 


i.j  f*w-m 


Tut  r.,«.  is  .»«c  *«  W*  ““ 

(M  W-*>  tr  »f  ’•»“  »>  <• 

, diaarar  It  ^ clWr  ** 

occurs  "here  appears  ip  the  di««r« 

_ t#d  M the  of  fables  bound  at  the  an*  LABELS 

components  are  computed  ®"  the  msis  o _ . . 

rtoer  the  tree  to  VARIABLE  nodes,  "here  It 
nodes  The  E*  information  proxies  dour  tree 

. nee  owe  for  the  one  used  by  the  original  code, 
xs  used  to  supply  the  correct  no*. 

_ -.u..-~  » «• 


72 


VARIABLE 

ENV:  ( (F  FNVAR-2)  (N  VAR-1)) 

VAR:  VAR-1 

6L08ALP : NIL 
CONSTANT 

ENV:  ((F  FNVAR-2)  (N  VAR-1)) 

VALUE : 1 


The  reader  is  asked  to  imagine  that  the  expression 


((A  VAR-4)  (M  VAR-3)  (F  FNVAR-2)  (N  VAR-1)) 


occurs  where  ***  appears  in  the  diagram.  It  should  be  clear  how  the  ENV  ' 

components  are  computed  on  the  basis  of  variables  bound  at  the  LAMBDA  and  LABELS 
nodes.  The  ENV  information  propagates  down  the  tree  to  VARIABLE  nodes,  where  it 
is  used  to  supply  the  correct  new  name  for  the  one  used  by  the  original  code. 

The  first  step  in  the  preliminary  analysis  is  the  determination  of 
referenced  variables: 


LAMBDA 

REFS:  () 

VARS:  (VAR-1) 

BOOY: 

LABELS 

REFS:  (VAR-1) 

FNVARS:  (FNVAR-2) 

FNDEFS: 

LAMBDA 

REFS:  (FNVAR-2) 

VARS:  (VAR-3  VAR-4) 

BOOY:  IF 

REFS:  (FNVAR-2  VAR-3  VAR-4) 

FRED:  COMBINATION 

REFS:  (VAR-3) 

AR6S:  VARIABLE 

REFS:  () 

VAR: 

6L0BALP:  T 
VARIABLE 

REFS:  (VAR-3) 

VAR:  VAR-3 

6L06ALP:  NIL 
CONSTANT 

REFS:  () 


BOOY. 

I 
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VALUE:  • 

CON:  VARIABLE 

REES:  (VAR-4) 

VAR:  VAR -4 

GLOBALP:  NIL 
ALT:  COMBINATION 

REFS:  (FNVAR-2  VAR-3  VAR-4) 

ARGS:  VARIABLE 

REFS:  (FNVAR-2) 

VAR:  FNVAR-2  . 

6L0BALP : NIL 
COMBINATION 

REFS:  (VAR-3) 

ARGS:  VARIABLE 

REFS:  () 

VAR:  - 1 

GLOBALP:  T 
VARIABLE 

REFS:  (VAR- 3) 

VAR:  VAR- 3 

GLOBALP:  NIL 
CONSTANT 

REFS:  () 

VALUE : 1 

COMBINATION 

REFS:  (VAR-3  VAR-4) 

ARGS:  VARIABLE 

REFS:  () 

VAR:  * 

GLOBALP:  T 
VARIABLE 

REFS:  (VAR-3) 

VAR:  VAR-3 

6L0BALP:  NIL 
VARIABLE 

REFS:  (VAR-4) 

VAR:  VAR-4 

6L0BALP:  NIL 

COMBINATION 

REFS:  (FNVAR-2  VAR-1) 

ARGS:  VARIABLE 

REFS:  (FNVAR-2) 

VAR:  FNVAR-2 

GLOBALP:  NIL 
VARIABLE 

REFS:  (VAR-1) 

VAR:  VAR-1 

GLOBALP:  NIL 
CONSTANT 

REFS:  () 

VALUE:  1 


A 
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The  REFS  component  is  a list  of  all  local  variables  referenced  at  or  below  the 
node.  Notice  that,  in  general,  the  REFS  component  of  a node  is  the  union 
(considering  them  as  sets)  of  the  REFS  components  of  its  subnodes.  In  this  way 
the  information  sifts  up  from  the  VARIABLE  nodes.  At  a LAMBDA,  LABELS,  or  CATCH, 
the  variables  bound  at  that  node  are  filtered  out  of  the  REFS  sifting  up.  The 
REFS  for  the  outer  function  must  always  be  (),  a useful  error  check.  In  this 
example,  we  see  that  VAR-1  (N)  is  not  referenced  by  the  function  FNVAR-2  (F). 
This  indicates  that  a closure  for  this  function  need  not  contain  the  value  for 
VAR-1  in  its  environment.  (We  will  not  actually  u£e  the  information  for  this 
purpose,  since  later  analysis  will  determine  that  the  function  need  not  have  a 
closure  constructed  for  it.)  Another  component  ASETVARS  is  computed  for  each 
node,  which  contains  the  set  of  variables  appearing  in  an  ASET'  at  or  below  the 
node.  We  have  omitted  this  information  from  the  diagram  since  the  value  is  the 
empty  set  in  all  cases.  Certain  properties  are  placed  on  the  property  list  of 
each  variable  as  well,  which  are  not  shown  here. 

The  next  pass  locates  trivial  subforms: 


LAMBDA 

TRIVP:  MIL 
VARS:  (VAR-1) 

BOOV : 

LABELS 

TRIVP:  NIL 

FNVARS : (FNVAR-2) 

FNDEFS: 

LAMBOA 

TRIVP:  NIL 

VARS:  (VAR-3  VAR-4) 

BODY:  IF 

TRIVP:  NIL 
PRED:  COMBINATION 

TRIVP:  T 
AR6S:  VARIABLE 

TRIVP:  T 
VAR: 

6L06ALP:  T 
VARIABLE 
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TRIVP:  T 
VAR:  VAR-3 

GLOBALP:  NIL 
CONSTANT 

TRIVP:  T 
VALUE : 0 

CON:  VARIABLE 

TRIVP:  T 
VAR:  VAR-4 

GLOBALP:  NIL 
ALT:  COMBINATION 

TRIVP:  NIL 
ARGS:  VARIABLE 

TRIVP:  T 
VAR:  FNVAR-2 

GLOBALP:  NIL 
COMBINATION 
TRIVP:  T 
ARGS:  VARIABLE 

TRIVP:  T 
VAR: 

GLOBALP:  T 
VARIABLE 

TRIVP:  T 
VAR:  VAR-3 

GLOBALP:  NIL 
CONSTANT 

TRIVP:  T 
VALUE:  1 

COMBINATION 
TRIVP:  T 
ARGS:  VARIABLE 

TRIVP:  T 
VAR:  * 

GLOBALP:  T 
VARIABLE 

TRIVP:  T 
VAR:  VAR -3 

GLOBALP:  NIL 
VARIABLE 

TRIVP:  T 
VAR:  VAR-4 

GLOBALP:  NIL 

BODY:  COMBINATION 

TRIVP:  NIL 
ARGS:  VARIABLE 

TRIVP:  T 
VAR:  FNVAR-2 

GLOBALP:  NIL 
VARIABLE 

TRIVP:  T 
VAR:  VAR-1 

GLOBALP:  NIL 
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CONSTANT 

TRIVP:  T 
VALUE : 1 

Constants  and  variables  are  always  trivial,  and  trivial  combinations  (involving 
only  MacLISP  primitives)  are  located.  As  before,  in  this  pass  information  sifts 
up  from  below.  One  possibility  not  yet  explored  in  RABBIT  is  to  isolate  entire 
SCHEME  functions  (for  example  FNVAR-Z),  determine  that  it  is,  as  a whole, 
trivial,  compile  it  as  a simple  MacLISP  SUBR,  and  reference  it  as  a primitive. 
This  would  in  turn  render  trivial  the  combination  (F  N 1)  in  the  body  of  the 
LABELS,  for  example. 

The  analysis  of  side-effects  merely  determines  that  no  side-effects  are 
present,  and  is  uninteresting  for  our  example.  The  optimization  pass  finds  no 
transformations  worth  making.  We  will  skip  over  these  steps  to  the  conversion  to 
continuation-passing  style.  As  a simple  S-expression,  this  may  be  rendered  as: 

(LAMBDA  (CONT-5  VAR-1) 

(LABELS  ( ( FNVAR-Z 

(LAMBDA  (CONT-6  VAR-3  VAR-4) 

(IF  (=  VAR-3  0) 

(CONT-6  VAR-4) 

(FNVAR-Z  CONT-6 

| . VAR-T  1 1 

(a  VAR-3  VAR-4)))))) 

(FNVAR-Z  CONT-5  VAR-1  1))) 

In  rendering  this  as  a tree  of  data  structures,  we  use  structures  of  type  CLAMBDA 
instead  of  LAMBDA,  etc.,  in  order  to  prevent  confusion.  Trivial  forms  are 
represented  by  structures  of  type  TRIVIAL  with  pointers  to  the  data  structures 
from  before.  We  will  not  notate  such  data  structures  in  the  following  diagrams, 
but  will  simply  write  an  S-expression  as  a reminder  of  what  the  trivial  form  was. 
The  types  RETURN  and  CONTINUATION  are  like  CCOMBINATION  and  CLAMBDA,  but  are 
distinguished  as  discussed  above  for  convenience  and  for  purposes  of  consistency 
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checking . 


CLAMBOA 

VARS: 

BODY: 


(CONT-5  VAR-1) 

CLABELS 

FNVARS : ( FNVAR-2 ) 

FNOEFS:  CLAMBOA 

VARS:  (CONT-6  VAR-3  VAR-4) 

BOOY:  CIF 

PREO:  TRIVIAL 

(•  VAR-3  0) 

CON:  RETURN 

CONT:  CVARIABLE 

VAR:  CONT-6 

VAL:  TRIVIAL 

VAR-4 

ALT:  CCOMBINAT ION 

AR6S:  TRIVIAL 

FNVAR-2 

CVARIABLE 

VAR:  CONT-6 

TRIVIAL 

(-  VAR-J  i) 
TRIVIAL 

(*  VAR-3  VAR-4) 

BOOY:  CCOMBINAT ION 

ARGS:  TRIVIAL 

FNVAR-2 

CVARIABLE 


VAR:  CONT-5 

TRIVIAL 
VAR-1 
TRIVIAL 
1 


The  first  post-conversion  analysis  pass  computes  ENV  and  REFS  components 
as  before,  this  time  including  the  variables  introduced  to  represent 
continuations.  The  ENV  in  this  case  is  not  an  a-list,  but  simply  a list  of 
variables,  since  no  renaming  is  taking  place.  The  ENV  information  sifts  down 
from  above  during  the  tree  walk,  and  on  the  way  back  the  REFS  information  sifts 
up.  For  a TRIVIAL  node,  the  REFS  information  is  taken  from  the  pre-conversion 
node  referenced  by  the  TRIVIAL  node;  this  REFS  information  is  shown  here  as  a 
reminder.  As  before,  the  REFS  information  for  a node  is  always  a subset  of  the 
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ENV  information. 


CLAMBOA 

ENV:  () 

REFS:  () 

VARS:  (CONT-5  VAR-1) 

BODY:  CLABELS 

ENV:  (CONT-5  VAR-1) 

REFS:  (CONT-5  VAR-1) 

FNVARS:  (FNVAR-2) 

FNDEFS:  CLAMBDA 

ENV:  (FNVAR-2  CONT-5  VAR-1) 

REFS:  (FNVAR-2) 

VARS:  (CONT-6  VAR-3  VAR-4 ) 

BODY:  CIF 

ENV:  *** 

REFS:  (FNVAR-2  CONT-6  VAR-3  VAR-4) 

PRED:  TRIVIAL 

REFS:  (VAR-3) 

(»  VAR-3  0) 

CON:  RETURN 

ENV:  *** 

REFS:  (CONT-6  VAR-4) 

CONT:  CVAR1ABLE 

ENV:  «•» 

REFS:  (CONT-6) 

VAR:  CONT-6 

VAL:  TRIVIAL 

REFS:  (VAR-4) 

VAR-4 

ALT:  COMBINATION 

ENV:  *** 

REFS:  (FNVAR-2  CONT-6  VAR-3  VAR-4) 

ARGS:  TRIVIAL 

REFS:  (FNVAR-2) 

FNVAR-2 

CVARIABLE 

ENV: 

REFS:  (CONT-6) 

VAR:  CONT-6 

TRIVIAL 

REFS:  (VAR-3) 

(-  VAR-3  1) 

TRIVIAL 

REFS:  (VAR-3  VAR-4) 

(*  VAR-3  VAR-4) 

BODY:  COMBINATION 

ENV:  (FNVAR-2  CONT-5  VAR-1) 

REFS:  (FNVAR-2  CONT-5  VAR-1) 

ARGS:  TRIVIAL 

REFS:  (FNVAR-2) 

FNVAR-2 
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CVARIABLE 

ENV:  (FNVAR-2  CONT-S  VAR-1) 

REFS:  (CONT-S) 

VAR:  COMT-S 

TRIVIAL 

REFS:  (VAR-1) 

VAR-1 

TRIVIAL 

REFS:  () 

1 

The  reader  is  asked  to  imagine  that  where  ***  occurs  the  expression 
(CONT-6  VAR-3  VAR-4  FNVAR-2  CONT-5  VAR-1) 

had  been  written  instead.  An  additional  operation  performed  on  this  pass  is  to 
flag  all  variables  referenced  in  other  than  function  position.  These  include 
VAR-1,  VAR-3,  etc.;  but  FNVAR-2  is  not  among  them.  This  will  be  of  importance 
below. 

The  next  pass  determines  all  variables  referenced  by  closures  at  or  below 
each  node,  and  also  decides  which  functions  will  actually  be  closed.  It  is 
determined  that  FNVAR-2  need  not  be  closed,  because  it  is  referred  to  only  in 
function  position  (as  determined  by  the  previous  pass),  and  is  not  referred  to  by 
any  other  closures.  As  a result,  no  closures  are  created  at  all  in  this 
function,  and  so  all  the  computed  sets  of  variables  are  empty.  This  pass  also 
assigns  the  name  F-7  to  the  outer  function,  for  use  later  as  a tag. 

The  third  pass  computes  the  "depth”  of  each  function,  which  determines 
’*>  >>i  jn  -bat  registers  or  other  locations  arguments  will  be  passed  for  each 
'»>  In  this  case  the  outer  CLAHBDA  is  assigned  depth  0,  and  the  one 
• • • •.ap  it  assigned  depth  2,  because  it  is  not  closed,  and  is  contained 

■ ■ > nf  l arguments.  In  this  way  registers  are  allocated  in  a 
»-•  .a#  mm>«>  all  closed  functions  are  of  depth  0,  and  all  unclosed 
• • r feet  *1  the  containing  function  and  Its  number 
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environments  for  closed  functions.  Since  there  are  none  in  this  example,  this 
analysis  is  of  little  interest  here. 

Finally,  we  are  ready  to  generate  code.  Consider  the  S-expression  form: 

(LAMBDA  (CONT-5  VAR-1) 

(LABELS  ((FNVAR-Z 

(LAMBDA  (CONT-6  VAR-3  VAR-4) 

(IF  (*  VAR-3  0) 

(CONT-6  VAR-4) 

(FNVAR-Z  CONT-6 

(-  VAR-3  1) 

(*  VAR-3  VAR-4)))))) 

(FNVAR-Z  CONT-5  VAR-1  1))) 

The  first  function  encountered  is  the  outer  one  (named  F-7).  In  analyzing  its 
body  we  note  the  LABELS,  and  place  all  the  labelled  functions  (that  is,  FNVAR-Z) 
on  the  queue  of  functions  yet  to  be  processed.  We  then  analyze  the  body  of  the 
LABELS.  This  is  a combination,  and  so  we  analyze  each  argument,  producing  code 
for  each.  Each  argument  must  be  TRIVIAL,  a (C)VARIABLE,  or  a (C)LAMBDA- 
expression.  (We  shall  refer  to  this  set  of  possibilities  as  "meta-trivial" , 
which  means  what  "trivial"  did  in  [Imperative].)  The  variable  FNVAR-Z  refers  to 
a known  function  which  is  not  closed,  and  so  we  need  not  set  up  **FUN**.  The 
others  may  be  referred  to  as  **CONT**,  **ONE**,  and  the  constant  1,  respectively. 
These  are  to  be  passed  to  FNVAR-Z  through  the  registers  **TWO**,  **THREE**,  and 
**FOUR**  (as  determined  by  the  register  allocation  pass).  Thus  the  code  for  F-7 
looks  like  this: 


F-7 


((LAMBDA  (Q-40  Q-41  Q-4Z) 

(SETQ  **FOUR**  Q-4Z) 
(SETQ  **THREE**  Q-41) 
(SETQ  **TWO**  Q-40)) 
**CONT**  **ONE**  ’1) 


(GO  FNVAR-Z) 


The  first  form  sets  up  the  arguments,  using  a standard  "simultaneous  assignment" 
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of  arguments. 

One  way  to  think  about  this  trick,  is  as  follows.  A closure  consists  of  a 
pointer  to  a piece  of  code  and  a set  of  values  determined  at  the  time  of  closure. 
When  the  closure  is  invoked,  we  execute  the  code,  making  available  to  it  (a)  the 
set  of  values  (its  environment),  and  (b)  some  additional  arguments.  Slicing 
these  components  a different  way,  we  may  think  of  calling  the  bare  code, 
supplying  all  the  values  as  arguments;  we  pass  the  arguments  in  some  registers, 
and  the  environment  values  in  some  other  registers.  Put  yet  another  way,  if  we 
can  determine  that  every  caller  of  the  closed  function  can  reconstruct  the 
necessary  environment  at  the  time  of  the  call  (because  it  will  have  available  the 
necessary  values  anyway),  then  we  can  avoid  constructing  the  closure  at  the  point 
where  the  function  should  be  closed,  and  instead  arrange  for  each  caller  to  pass 
the  environment  through  specified  registers.  As  mentioned  earlier,  the  compiler 
has  a completely  free  hand  in  determining  the  format  of  an  environment! 

As  it  happens,  the  function  labelled  FNVAR-2  does  not  reference  CONT-5  or 
VAR-1,  and  so  this  argument  is  of  no  importance  here.  It  is  determined  that  the 
following  register  assignments  will  apply: 


CONT-5 

**CONT** 

VAR-1 

**0NE** 

FNVAR-2 

<none> 

CONT-6 

**TW0** 

VAR -3 

**THREE** 

VAR -4 

**FOUR** 

(Note  Continuation  Variable  Hack)  We  will  see  below  that  some  unnecessary 
shuffling  of  values  results;  a more  complicated  register  assignment  technique 
would  be  useful  here.  (One  was  outlined  in  [Declarative],  but  it  has  not  been 
implemented.  See  also  [Wulf]  and  [Johnsson].) 

The  fourth  post-conversion  analysis  pass  determines  the  format  of 
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construction.  The  second  branches  to  the  code  for  FNVAR-2.  Because  a known 
function  is  being  called,  it  is  not  necessary  to  set  up  **NARGS**.  Because 
FNVAR-2  requires  no  closure,  it  is  not  necessary  to  set  up  **ENV**. 

The  next  function  on  the  queue  to  process  Is  FNVAR-2.  Its  body  is  an  IF 
(actually  a CIF);  this  is  compiled  into  a COND  containing  the  code  for  the 
predicate,  consequent,  and  alternative: 

(COND  (<predicate>  <consequent>) 

(T  <alternative>) ) 

The  predicate  is  guaranteed  to  be  meta-trivial.  It  is,  in  this  example,  a 
trivial  combination;  this  is  compiled  by  changing  all  the  variable  references 
appropriately,  producing  (=  **THREE**  *0). 

The  consequent  involves  calling  an  unknown  continuation  which  is  in 
**TWO**.  The  returned  value  is  in  **F0UR**.  The  code  produced  is: 

(SETQ  **FUN**  **TW0**) 

( SETQ  **0NE**  **F0UR**) 

(RETURN  NIL) 

The  (RETURN  NIL)  exits  the  module,  passing  control  to  the  dispatcher  in  the 
SCHEME  interpreter,  which  will  arrange  to  invoke  the  continuation. 

The  code  for  the  alternative  is  similar  to  that  for  the  body  of  F-7, 
because  we  are  calling  the  known  function  FNVAR-2.  The  generated  code  is: 

((LAMBDA  ( Q-43  Q-44) 

(SETQ  **F0UR**  Q-44) 

(SETQ  **THREE**  Q-43)) 

(-  **THREE**  *1) 

(*  **THREE**  **FOUR**)) 

(GO  FNVAR-2) 

The  argument  set-up  ought  to  involve  copying  **TW0**  into  **TW0**,  but  a peephole 
optimization  eliminates  that  SETQ. 
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Putting  all  this  together,  the  code  for  FNVAR-2  is: 


FNVAR-2  (COND  ((=  **THREE**  ’0) 

(SETQ  **FUN**  **TUO**) 

(SETQ  **ONE**  **FOUR**) 

(RETURN  NIL)) 

(T  ((LAMBDA  (Q-43  Q-44) 

(SETQ  **FOUR**  Q-44) 
(SETQ  **THREE**  Q-43)) 
(-  **THREE**  *1) 

(*  **THREE**  **FOUR**) ) 

(GO  FNVAR-2))) 


(We  have  glossed  over  the  peephole  optimizations  which  eliminate  occurrences  of 
PROGN  in  such  places  as  COND  clauses.) 

There  are  no  more  functions  to  be  processed,  and  so  we  now  create  the 
final  module.  The  final  output,  with  comments  inserted  by  RABBIT  for  debugging 
purposes,  and  declarations  supplied  by  RABBIT  for  the  benefit  of  the  MacLISP 
compiler,  looks  like  this: 


(PROGN  ’COMPILE 

(COMMENT  MODULE  FOR  FUNCTION  IFACT) 

( DEFUN  7-37  ( ) 

( PROG  ( ) 

(DECLARE  (SPECIAL  7-37)) 

(GO  ( PR062  NIL  (CAR  **ENV**)  (SETQ  **ENV«*  (COR  **ENV«*)))) 
P-7  (COMMENT  (DEPTH  * 0)  (FNP  • NIL)  (VARS  - (CONT-5  N))) 
((LAMBDA  (Q-40  Q-41  Q-42) 

(SETQ  **FOUR**  Q-42) 

(SETQ  **THREE**  Q-41) 

(SETQ  **TWO**  Q-4# ) ) 

**CONT**  **ONE*»  ’1) 

(COMMENT  (DEPTH  - 2)  (FNP  - NOCLOSE)  (VARS  • (CONT-6  M A))) 
(GO  FNVAR-2) 

FNVAR-2 

(COMMENT  (DEPTH  • 2)  (FNP  • NOCLOSE)  (VARS  - (CONT-B  HA))) 
(COND  ((*  **THREE»*  ’0) 

(SETQ  **FUN»*  **TVO*«) 

(SETQ  »»ONE*»  *»FOUR**) 

(RETURN  NIL)) 

(T  ((LAMBDA  (Q-43  Q-44) 

(SETQ  **FOUR»*  Q-44) 

(SETQ  ** THREE**  Q-43)) 

(-  *»THREE**  ’1) 

(*  **THREE**  **FOUR**)) 


(COMMENT  (OEPTH  « 2)  (FNP  - NOCLOSE)  (VARS  - (CONT-6  M A))) 
(GO  FNVAR-2 ) ) ) ) ) 

(SETO  7-37  (GET  '7-37  'SUBR)) 

(SETQ  I FACT  (LIST  'CBETA  7-37  1 F - 7 ) ) 

(OEFPROP  7-37  I FACT  USER-FUNCTION)) 


In  the  interpolated  comments,  FNP  refers  to  whether  the  function  being  entered  or 
be  .ng  called  is  closed  or  not  (the  possibilities  are  NIL,  NOCLOSE,  and  EZCLOSE). 
The  VARS  are  the  passed  variables,  expressed  as  the  names  from  the  original 
source  code,  except  for  those  introduced  by  the  CPS  conversion.  The  form  (SETQ 
I FACT  — ) constructs  the  closure  for  the  globally  defined  function  IFACT.  The 
DEFPROP  form  provides  debugging  information. 

The  points  of  interest  in  this  example  are  the  isolation  of  trivial 
subforms,  and  the  analysis  of  the  function  FNVAR-2  which  allows  it  to  be  called 
with  GO.  Examination  of  the  output  code  will  show  that  FNVAR-2  is  coded  as  an 
iterative  loop.  While  the  register  allocation  leaves  something  to  be  desired, 
the  inner  loop  does  surprisingly  little  shuffling.  (This  should  be  compared  with 
the  code  suggested  in  [Declarative]  for  this  function.) 

For  those  who  prefer  "real”  machine  language,  we  give  a plausible 
transcription  of  the  MacLISP  code  into  our  hypothetical  machine  language: 


IFACT:  PUSH  CONT  ;CONT  contains  the  return  address 

PUSH  ONE 
PUSH  1 
POP  FOUR 
POP  THREE 
POP  TWO 
GOTO  FNVAR2 
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FNVARZ : JUMP-IF-ZERO  THREE, FNV2A 
MOVE  ONE, FOUR 

RETURN  (TWO)  ; return  to  address  in  TWO 

FNVZA:  MOVE  TEMP, THREE  ;TEHP  is  used  to  evaluate 

ADD  TEMP, 1 ; trivial  forms 

PUSH  TEMP 
MOVE  TEMP, THREE 
MUL  TEMP, FOUR 
PUSH  TEMP 
POP  FOUR 
POP  THREE 
GOTO  FNVAR2 


WhiJe  this  is  not  the  world's  most  impressively  tight  code,  it  again  shows  the 
essential  iterative  structure  of  the  inner  loop.  The  primary  problem  is  the 
absence  of  analysis  of  which  registers  are  used  when.  Leaving  aside  the  question 
of  allocating  registers,  one  could  at  least  determine  when  assigning  values  to 
registers  for  argument  set-up  can  occur  sequentially  rather  than  simultaneously. 

There  are  a few  other  obvious  optimizations  which  have  not  been 
performed,  for  example  the  elimination  of  (GO  FNVAR-2)  just  before  the  tag  FNVAR- 
2.  While  this  would  not  have  been  difficult,  we  knew  that  the  NacLISP  compiler 
would  take  care  of  this  for  us;  since  it  is  not  a very  interesting  issue,  we  let 
it  slide. 
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10.  Performance  Measurements 

RABBIT  has  provision  for  metering  runtime  usage,  and  for  controlling 
whether  certain  options  in  the  optimizer  are  used.  The  standard  test  case  has 
been  RABBIT  compiling  itself  (!);  by  running  both  interpreted  and  compiled 
version  of  this  task,  some  comparisons  have  been  made.  Two  different  compiled 
versions  have  also  been  tested,  where  the  code  was  produced  with  or  without  using 
the  optimizer. 

The  overall  speed  gain  of  unoptimized  compiled  code  over  interpreted  code 
has  been  measured  to  be  a factor  of  25.  The  speed  gain  ratio  excluding  time  for 
garbage  collection  was  17,  and  the  garbage  collection  time  ratio  was  34.  (The 
SCHEME  interpreter  does  a lot  of  consing.  The  straight  runtime  ratio  of  17  is 
roughly  typical  for  standard  LISP  compilers  on  non-numeric  code.) 

The  overall  speed  ratio  of  optimized  compiled  code  to  unoptimized 
compiled  code  has  been  measured  to  be  1.2.  The  speed  ratio  excluding  garbage 
collection  was  1.37,  and  the  garbage  collection  time  ratio  was  1.07.  We  conclude 
that  the  amount  of  consing  was  reduced  very  little,  despite  optimizations  which 
may  eliminate  closures,  because  the  phase-2  analysis  of  closures  eliminated  most 
consing  from  that  source  anyway.  Eliminations  of  register  shuffling  because  of 
substitutions  of  one  variable  for  another  were  probably  more  significant. 

Combining  these  figures  yields  an  overall  speed  ratio  for  optimized 
compiled  code  over  interpreted  code  of  about  30. 

Turning  now  to  the  analysis  of  compilation  time,  as  opposed  to  running 
time,  we  have  found  that  using  the  optimizer  approximately  doubles  the  cost  of 
compilation.  It  might  be  possible  to  reduce  this  with  a more  clever  optimizer; 
presently  RABBIT  wastes  much  time  re-doing  certain  analysis  unnecessarily.  The 
extra  time  needed  by  the  optimizer  excluding  garbage  collection  is  only  half 
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again  the  overall  compilation  time,  but  ihe  garbage  collection  time  triples, 
because  the  optimizer  copies  and  re-copies  p?rts  of  the  program. 

There  is  also  one  error  check  which  is  very  expensive;  it  checks  every 
argument  of  a combination  against  every  other  argument  to  check  for  possible 
side-effect  conflicts  (this  is  the  "liberal"  analysis  in  EFFS-ANALYZE,  and  the 
testing  done  by  CHECK-CONBINATION-PEFFS) . Use  of  this  error  check  increased 
compilation  time  by  thirty  percent. 
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11.  Comparison  with  Other  Work 

The  only  other  work  we  know  of  similar  to  ours  is  that  in 
[Wand  and  Friedman].  They  use  a technique  from  category  theory  known  as 
factorization  to  isolate  trivial  expressions.  As  far  as  they  go,  their  work  is 
similar  to  ours;  they  have  written  a compiler  for  LISP  code,  producing  output 
code  which  uses  continuations.  However,  they  indicate  that  they  cannot  interface 
compiled  and  interpreted  code  correctly.  Moreover,  while  they  use  continuations, 
they  do  not  make  general  use  of  closures,  and  in  fact  there  is  no  clue  that 
closures  are  permitted  in  their  source  language,  or  that  functions  are 
permissible  as  data  objects.  (In  fact,  there  is  evidence  to  the  contrary  in 
several  examples  they  give  involving  an  expression 

( MAPCAR  (QUOTE  (LAMBDA  ...))  ...) 

These  seem  to  indicate  that  they  have  not  made  the  crucial  distinction  between 
treating  a function  as  a data  object  and  treating  a representation  of  a function 
as  data.)  Wand  and  Friedman  do  realize  the  importance  of  tail-recursion,  but 
fail  to  mention  the  necessity  for  lexical  scoping  (perhaps  taking  it  for 
granted).  We  feel  that  the  contributions  of  category  theory  may  provide 
interesting  new  ways  to  analyze  programs,  but  also  feel  that  Wand  and  Friedman 
have  not,  in  the  work  cited,  explored  it  thoroughly,  since  they  have  not  even 
explored  the  issue  of  closures  as  such. 

Somewhat  more  distantly  related  is  the  work  of  Carter  and  others  at  the 
IBM  T.J.  Watson  Research  Lab.  [Carter]  This  work  is  similar  in  spirit,  in  that 
it  uses  "macro  definitions"  of  complex  operators,  which  are  integrated  into  the 
program  being  compiled,  followed  by  source-to-source  program  transformations 
which  optimize  the  resulting  code.  However,  they  have  primarily  worked  with 
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definitions  of  complex  data  manipulations,  such  as  string  concatenation,  whereas 
this  report  has  dealt  exclusively  with  environment  and  control  operations. 
(Also,  as  a matter  of  taste,  we  find  SCHEME  a simpler  and  more  tractable  language 
to  deal  with  than  the  low-level  dialect  of  PL/I  used  in  [Carter],  partly  because 
of  its  closeness  to  lambda-calculus  and  partly  because  SCHEME  inherits  from  LISP 
the  natural  ability  to  deal  with  representations  of  its  own  programs.) 
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12.  Conclusions  and  Future  Work 


Lexical  scoping,  tail-recursion,  the  conceptual  treatment  of  functions 
(as  opposed  to  representations  thereof)  as  data  objects,  and  the  ability  to 
notate  "anonymous"  functions  make  SCHEME  an  excellent  language  in  which  to 
express  program  transformations  and  optimizations.  Imperative  constructs  are 
easily  modelled  by  applicative  definitions.  Anonymous  functions  make  it  easy  to 
avoid  needless  duplication  of  code  and  conflict  of  variable  names.  A language 
with  these  properties  is  useful  not  only  at  the  preliminary  optimization  level, 
but  for  expressing  the  results  of  decisions  about  order  of  evaluation  and  storage 
of  temporary  quantities.  These  properties  make  SCHEME  as  good  a candidate  as  any 
for  an  UNCOL.  The  proper  treatment  of  functions  and  function  calls  leads  to 
generation  of  excellent  imperative  low-level  code. 

We  have  emphasized  the  ability  to  treat  functions  as  data  objects.  We 
should  point  out  that  one  might  want  to  have  a very  simple  run-time  environment 
which  did  not  support  complex  environment  structures,  or  even  stacks.  Such  an 
end  environment  does  not  preclude  the  use  of  the  techniques  described  here.  Many 
optimizations  result  in  the  elimination  of  LAMBDA-expressions;  post  CPS- 
conversion  analysis  eliminates  the  need  to  close  many  of  the  remaining  LAMBDA- 
expressions.  One  could  use  the  macros  and  internal  representations  of  RABBIT  to 
describe  intermediate  code  transformations,  and  require  that  the  final  code  not 
actually  create  any  closures.  As  a concrete  example,  imagine  writing  an 
operating  system  in  SCHEME,  with  machine  words  as  the  data  domain  (and  functions 
excluded  from  the  run-time  data  domain).  We  could  still  meaningfully  write,  for 
example : 
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(IF  (OR  (STOPPED  (PROCESS  I)) 

( AWAITING- IMPUT  (PROCESS  I))) 

(SCHEDULE-LOOP  (+  I 1)) 

(SCHEDULE-PROCESS  I)) 

While  the  intermediate  expansion  of  this  code  would  conceptually  involve  the  use 
of  functions  as  data  objects,  optimizations  would  reduce  the  final  code  to  a form 
which  did  not  require  closures  at  run  time. 

An  experiment  we  would  like  to  try  would  be  to  use  CGOL  [Pratt],  a 
program  which  parses  ALGOL-like  syntax  and  produces  LISP  code,  as  a front  end  for 
RABBIT.  The  result  would  be  a compiler  for  an  ALGOL-like  language  which  would 
produce  code  by  the  processes  of  parsing  (by  CGOL);  macro-expansion, 
optimization,  and  output  of  MacLISP  code  (by  RABBIT);  and  generation  of  PDP-10 
machine  language  (by  the  MacLISP  compiler). 

Among  the  interesting  issues  we  have  not  dealt  with  or  have  not  yet 
implemented  in  RABBIT  are:  compilation  of  data  manipulation  primitives, 
interaction  of  such  primitives,  procedure  integration  of  the  most  general  form, 
and  complex  register  allocation.  A particularly  interesting  issue  is  that  of 
data  type  analysis.  Such  analysis  would  solve  certain  problems  which  cannot 
easily  be  solved  now  by  RABBIT.  For  example,  consider  the  piece  of  code: 

(IF  (OR  A B)  X Y) 

The  macro-expansion  and  optimization  phases  will  reduce  this  to: 

(IF  A (IF  A X Y)  (IF  B X Y) ) 

The  difficulty  is  that  RABBIT  has  no  way  of  knowing  that  A is  known  to  be  non- 
null in  the  first  inner  IF  by  virtue  of  the  testing  of  A in  the  outer  IF.  If  it 
could  realize  this,  then  the  code  would  reduce  to  the  more  reasonable: 
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(IF  A X (IF  B X Y)) 

Compare  this  with  the  case  of  (IF  (AND  ...)  ) presented  earlier. 

One  particularly  nagging  difficulty  concerns  an  interaction  between  CATCH 
and  optimization  by  substituting  expressions  for  variables.  The  problem  is  that 
if  an  expression  with  a side-effect  is  substituted  into  a place  which  is 
evaluated  after  the  return  of  a call  to  an  unknown  function  (where  it  had  been 
written  at  a place  normally  evaluated  before  the  call),  and  if  a CATCH  is 
performed  within  that  unknown  function,  and  the  escape  function  is  subsequently 
called  more  than  once,  then  the  expression  with  a side-effect  will  be  evaluated 
twice  instead  of  once.  There  is  no  possible  way  to  decide  whether  this  can 
happen,  other  than  to  be  fearful  of  all  unknown  function  calls.  In  practice  this 
defeats  most  optimization.  We  have  ignored  this  difficulty  in  RABBIT.  It 
probably  indicates  that  escape  functions  are  even  more  intractable  than  we  had 
earlier  believed.  It  would  not  be  so  bad  if  we  could  insist  that  an  escape 
function  be  called  no  more  than  once  (or  rather,  that  a CATCH  be  returned  from  no 
more  than  once,  implying  that  if  the  escape  function  is  used  it  must  be 
dynamically  within  the  body  of  the  CATCH).  If  this  restriction  is  enforced,  or 
if  CATCH  is  forbidden,  then  in  fact  no  continuation  can  be  invoked  more  than 
once,  which,  with  other  suitable  restrictions,  accounts  for  the  ability  of  most 
languages  to  use  stacks  instead  of  trees  for  their  control  stacks. 
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Notes 

{Note  ASET'  Is  Imperative} 

It  is  true  that  ASET'  is  an  actual  imperative  which  produces  a side 
effect,  and  is  not  expressed  applicatively.  ASET*  is  used  only  for  two  purposes 
in  practice:  to  initialize  global  variables  (often  relating  to  MacLISP 
primitives),  and  to  implement  objects  with  state  (cells,  in  the  PLASNA  sense 
[Smith  and  Hewitt]  [Hewitt  and  Smith]).  If  we  were  to  redesign  SCHEME  from 
scratch,  I imagine  that  we  would  introduce  cells  as  our  primitive  side-effect 
rather  than  ASET'.  The  decision  to  use  ASET'  was  motivated  primarily  by  the 
desire  to  interface  easily  to  the  MacLISP  environment  (and,  as  a corollary,  to  be 
able  to  implement  SCHEME  in  three  days  instead  of  three  years!). 

We  note  that  in  approximately  one  hundred  pages  of  SCHEME  code  written  by 
three  people,  the  non-quoted  ASET  has  never  been  used,  and  ASET'  has  been  used 
only  a dozen  times  or  so,  always  for  one  of  the  two  purposes  mentioned  above.  In 
most  situations  where  one  would  like  to  write  an  assignment  of  some  kind,  macros 
which  expand  into  applicative  constructions  suffice. 


sr  -» 
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{Note  Code  Pointers) 

Conceptually  a closure  is  made  up  of  a pointer  to  some  code  (a  "script" 
[Smith  and  Hewitt])  and  an  environment.  In  a RABBIT-formatted  CBETA,  the  pointer 
to  the  code  is  encoded  into  two  levels:  a pointer  to  a particular  piece  of 
MacLISP  code,  plus  a tag  within  that  PROG.  This  implementation  was  forced  upon 
us  by  MacLISP.  If  we  could  easily  create  pointers  into  the  middle  of  a PROG,  we 
could  avoid  this  two-level  encoding. 

On  the  other  hand,  this  is  not  just  an  engineering  kludge,  but  can  be 
provided  with  a reasonable  semantic  explanation:  rather  than  compiling  a lot  of 
little  functions,  we  compile  a single  big  function  which  is  a giant  CASE 
statement.  Wherever  we  wish  to  make  a closure  of  a little  function,  we  actually 
close  a different  little  function  which  calls  the  big  function  with  an  extra 
argument  to  dispatch  on. 


(Note  Continuation  Variable  Hack) 

Since  the  dissertation  was  written,  a simple  modification  to  the  routine 
which  converts  to  continuation-passing  style  has  eliminated  some  of  the  register 
shuffling.  The  effect  of  the  change  was  to  perform  substitutions  of  one 
continuation  variable  for  another,  in  situations  such  as: 

((CLAMBDA  (CONT-3  ...)  ...) 

CONT-Z  ...) 

where  CONT-2  would  be  substituted  for  CONT-3  in  the  body  of  the  CLAMBDA- 
expression.  Once  this  is  done,  CONT-3  is  unreferenced,  and  so  is  not  really 
passed  at  all  by  virtue  of  the  phase-2  analysis.  The  result  is  that 
continuations  are  not  copied  back  and  forth  from  register  to  register.  In  the 
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iterative  factorial  example  in  the  text,  the  actual  register  assignment  would  be: 


**CONT** 

**ONE** 

<none> 

♦♦TWO** 

♦♦THREE** 


This  optimization  is  discussed  more  thoroughly  in  the  Appendix  near  the  routine 
CONVERT- COMB I NATION . 


CONT-5 
VAR-1 
FNVAR-2 
VAR -3 
VAR-4 


(Note  Dijkstra's  Opinion} 

In  [Dijkstra]  a remark  is  made  to  the  effect  that  defining  the  while-do 
construct  in  terms  of  function  calls  seems  unusually  clumsy.  In  [Steele]  we 
reply  that  this  is  due  partly  to  Dijkstra's  choice  of  ALGOL  for  expressing  the 
definition.  Here  we  would  add  that,  while  such  a definition  is  completely 
workable  and  is  useful  for  compilation  purposes,  we  need  never  tell  the  user  that 
we  defined  while-do  in  this  manner!  Only  the  writer  of  the  macros  needs  to  know 
the  complexity  involved;  the  user  need  not,  and  should  not,  care  as  long  as  the 
construction  works  when  he  uses  it. 
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(Note  Evaluation  for  Control} 

It  is  usual  in  a compiler  to  distinguish  at  least  three  "evaluation 
contexts":  value,  control,  and  effect.  (See  [Wulf],  for  example.)  Evaluation 
for  control  occurs  in  the  predicate  of  an  IF,  where  the  point  is  not  so  much  to 
produce  a data  object  as  simply  to  decide  whether  it  is  true  or  false.  The 
results  of  AND,  OR,  and  NOT  operations  in  predicates  are  "encoded  in  the  program 
counter".  When  compiling  an  AND,  OR,  or  NOT,  a flag  is  passed  down  indicating 
whether  it  is  for  value  or  for  control;  in  the  latter  case,  two  tags  are  also 
passed  down,  indicating  the  branch  targets  for  success  or  failure.  (This  is 
called  "anchor  pointing"  in  [Allen  and  Cocke].) 

In  RABBIT  this  notion  falls  out  automatically  without  any  special 
handling,  thanks  to  the  definition  of  AND  and  OR  as  macros  expanding  into  IF 
statements.  If  we  were  also  to  define  NOT  as  a macro 

(NOT  x)  =>  (IF  x ’NIL  'T) 

then  nearly  all  such  special  "evaluation  for  control”  cases  would  be  handled  by 
virtue  of  the  nested- IF  transformation  in  the  optimizer. 

One  transformation  which  ought  to  be  in  the  optimizer  is 

(IF  ((LAMBDA  (X  Y ...)  <body>)  A B ...)  <con>  <alt>) 

= > ((LAMBDA  (X  Y ...)  (IF  <body>  <con>  <alt»)  A B ...) 

which  could  be  important  if  the  <body>  is  itself  as  IF.  (This  transformation 
would  occur  at  a point  (in  the  optimizer)  where  no  conflicts  between  X,  Y,  ... 
and  variables  used  in  <con>  and  <alt>  could  occur.) 
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{Note  Evaluation  for  Effect} 

This  is  the  point  where  the  notion  of  evaluation  for  effect  is  handled 
(see  {Note  Evaluation  for  Control}).  It  is  detected  as  the  special  case  of 
evaluation  for  value  where  no  one  refers  to  the  value!  This  nay  be  construed  as 
the  distinction  between  "statement"  and  "expression"  made  in  Algol-like 
languages. 

{Note  Full-Funarg  Example} 

As  an  example  of  the  difference  between  lexical  and  dynamic  scoping, 
consider  the  classic  case  of  the  "funarg  problem".  We  have  defined  a function 
HAPCAR  which,  given  a function  and  a list,  produces  a new  list  of  the  results  of 
the  function  applied  to  each  element  of  the  given  list: 

(DEFINE  MAPCAR 

( LAMBDA  (FN  L) 

(IF  (NULL  L)  NIL 

(CONS  (FN  (CAR  L))  (HAPCAR  FN  (CDR  L)))))) 

Now  suppose  in  another  program  we  have  a list  X and  a number  L,  and  want  to  add  L 
to  every  element  of  X: 

( MAPCAR  ( LAMBDA  (Z)  (♦  Z L))  X) 

This  works  correctly  in  a lexically  scoped  language  such  as  SCHEME,  because  the  L 
in  the  function  (LAMBDA  (Z)  (♦  Z L))  refers  to  the  value  of  L at  the  point  the 
LAMBDA-expression  is  evaluated.  In  a dynamically  scoped  language,  such  as 
standard  LISP,  the  L refers  to  the  most  recent  run-time  binding  of  L,  which  is 
the  binding  in  the  definition  of  MAPCAR  (which  occurs  between  the  time  the 
LAMBDA-expression  is  passed  to  HAPCAR  and  the  time  the  LAMBDA-expression  is 
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invoked).  ^ 

j 

1 

{Note  Generalized  LABELS) 

Since  the  dissertation  was  written,  and  indeed  after  [Revised  Report] 
came  out,  the  format  of  LABELS  in  SCHEME  was  generalized  to  permit  labelled  < 

functions  to  be  defined  using  any  of  the  same  three  formats  permitted  by  DEFINE 
in  [Revised  Report].  RABBIT  has  been  updated  to  reflect  this  change,  and  the 
code  for  it  appears  in  the  Appendix.  1 

{Note  Heap-Allocated  Contours) 

RABBIT  maintains  heap-allocated  environments  as  a simple  chained  list  of 
variable  values.  However,  all  the  variables  which  are  added  on  at  once  as  a 
single  set  may  be  regarded  as  a new  "contour"  in  the  Algol  sense.  Such  contours 
could  be  heap-allocated  arrays  (vectors),  and  so  an  environment  would  be  a 
chained  list  of  such  little  arrays.  The  typical  Algol  implementation  technique 
using  a "display"  (a  margin  array  whose  elements  point  at  successive  elements 
(contours)  of  the  environment  chain)  is  clearly  applicable  here.  One  advantage 
of  the  list-of-all-values  representation  actually  used  in  RABBIT  is  that  null 
contours  automatically  add  no  content  to  the  environment  structure,  which  makes 
it  easier  to  recognize  later,  in  the  code  generator,  that  no  environment 
adjustments  are  necessary  in  changing  between  two  environments  which  differ  only 
by  null  contours  (see  the  code  for  ADJUST-KNOWNFN-CENV  in  the  Appendix). 
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{Note  Loop  Unrolling) 

In  the  case  of  a LABELS  used  to  implement  a loop,  the  substitution  of  a 
labelled  function  for  the  variable  which  names  it  would  constitute  an  instance  of 
loop  unrolling  [Allen  and  Cocke],  particularly  if  the  substitution  permitted 
subsequent  optimizations  such  as  eliminating  dead  code.  Here,  as  elsewhere,  a 
specific  optimization  technique  falls  out  as  a consequence  of  the  more  general 
technique  of  beta-conversion. 

(Note  Multiple-Argument  Continuations) 

One  could  easily  define  a SCHENE-like  language  in  which  continuations 
could  take  more  than  one  argument  (that  is,  functions  could  return  several 
values);  see  the  discussion  in  [Declarative].  We  have  elected  not  to  provide 
for  this  in  SCHEME  and  RABBIT. 


(Note  Non-deterministic  CPS  Conversion) 

As  with  optimization,  so  the  conversion  to  continuation-passing  style 
involves  decisions  which  ideally  could  be  made  non-deterministically.  The 
decisions  made  at  this  level  will  affect  later  decisions  involving  register 
allocation,  etc.,  which  cannot  easily  be  foreseen  at  this  stage. 
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{Note  Non-deterministic  Optimization} 

To  simplify  the  implementation,  RABBIT  uses  only  a deterministic  (and 
very  conservative)  optimizer.  Ideally,  an  optimizer  would  be  non-deterministic 
in  structure;  it  could  try  an  optimization,  see  how  the  result  interacted  with 
other  optimizations,  and  back  out  if  the  end  result  is  not  as  good  as  desired. 
We  have  experimented  briefly  with  the  use  of  the  AMORD  language  [Doyle]  to  build 
a non-deterministic  compiler,  but  have  no  significant  results  yet. 

We  can  see  more  clearly  the  fundamental  unity  of  macros  and  other 
optimizations  in  the  light  of  this  hypothetical  non-deterministic  implementation. 
Rather  than  trying  to  guess  ahead  of  time  whether  a macro  expansion  or 
optimization  is  desirable,  it  goes  ahead  and  tries,  and  then  measures  the  utility 
of  the  result.  The  only  difference  between  a macro  and  other  optimizations  is 
that  a macro  call  is  an  all-or-nothing  situation:  if  it  cannot  be  expanded  for 
some  reason,  it  is  of  infinite  disutility,  while  if  it  can  its  disutility  is 
finite.  This  leads  to  the  idea  of  non-deterministic  macro  expansions,  which  we 
have  not  pursued. 

{Note  Non-quoted  ASET} 

The  SCHEME  interpreter  permits  one  to  compute  the  name  of  the  variable, 
but  for  technical  and  philosophical  reasons  RABBIT  forbids  this.  We  shall  treat 
"ASET'"  as  a single  syntactic  object  (think  "ASETQ"). 

Hewitt  (private  communication)  and  others  have  objected  that  the  ASET 
primitive  is  "dangerous"  in  that  one  cannot  predict  what  variable  may  be 
clobbered,  and  in  that  it  makes  one  dependent  on  the  representation  of  variables 
(since  one  can  "compute  up"  an  arbitrary  variable  to  be  set).  The  first  is  a 
valid  objection  on  the  basis  of  programming  style  or  programming  philosophy. 
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(Indeed,  on  this  basis  alone  it  was  later  decided  to  remove  ASET  from  the  SCHEME 
language,  leaving  only  ASET'  in  [Revised  Report].)  The  second  is  only  slightly 
true;  the  compiler  can  treat  ASET  with  an  non-quoted  first  argument  as  a sort  of 
macro.  Let  VI,  V2,  ....  VN  be  the  names  of  the  bound  variables  accessible  to  the 
occurrence  of  ASET  in  question.  These  names  are  all  distinct,  for  if  two  are  the 
same,  one  variable  "shadows"  another,  and  so  we  may  omit  the  one  shadowed  (and  so 
inaccessible).  Then  we  may  write  the  transformation: 

(ASET  a b)  =>  ((LAMBDA  (Q1  Q2) 

(COND  ((EQ  Q1  'VI)  (ASET'  VI  QZ)) 

( ( EQ  Q1  'VZ)  (ASET1  V2  QZ)) 

((EQ  Q1  'VN)  (ASET'  VN  Q2)) 

(T  (GLOBAL-SET  P Q1  Q2)))) 
a 
b) 

This  transformation  is  to  be  made  after  the  alpha-conversion  process,  which 
renames  all  variables;  Q1  and  QZ  are  two  more  generated  variables  guaranteed  not 

to  conflict  with  VI VN.  This  expansion  makes  quite  explicit  the  fact  that 

we  are  comparing  against  a list  of  symbols  to  decide  which  variable  to  modify. 
The  actual  run-time  representation  of  variables  is  not  exploited,  the  one 
exception  being  the  GLOBAL-SET  operator,  which  raises  questions  about  the  meaning 
of  the  global  environment  and  the  user  interface  which  we  are  not  prepared  to 
answer. 

(See  also  (Note  ASET'  Is  Imperative}.) 
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{Note  Old  CPS  Algorithm) 

We  reproduce  here  Appendix  A of  [Declarative]: 

Here  we  present  a set  of  functions,  written  in  SCHEME,  which  convert  a 
SCHEME  expression  from  functional  style  to  pure  continuation-passing  style. 
(Note  PLASMA  CPS) 

(ASET'  GENTEMPNUM  0) 

(OEFINE  GENTEMP 

(LAMBDA  (X) 

(IMPLODE  (CONS  X (EXPLOOEN  (ASET*  GENTEMPNUM  (♦  GENTEMPNUM  1))))))) 

GENTEMP  creates  a new  unique  symbol  consisting  of  a given  prefix  and  a unique 
number. 

(DEFINE  CPS  (LAMBDA  (SEXPR)  (SPRINTER  (CPC  SEXPR  NIL  ’(FCONT#)))) 

CPS  (Continuation-Passing  Style)  is  the  main  function;  its  argument  is  the 
expression  to  be  converted.  It  calls  CPC  (C-P  Conversion)  to  do  the  real  work, 
and  then  calls  SPRINTER  to  pretty-print  the  result,  for  convenience.  The  symbol 
#CONT#  is  used  to  represent  the  implied  continuation  which  is  to  receive  the 
value  of  the  expression. 
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(DEFINE  CPC 

(LAMBDA  ( SEXPR  ENV  CONT) 

(COND  ((ATOM  SEXPR)  (CPC-ATOM  SEXPR  ENV  CONT)) 

((EQ  (CAR  SEXPR)  'QUOTE) 

(IF  CONT  "(.CONT  .SEXPR)  SEXPR)) 

((EQ  (CAR  SEXPR)  -LAMBDA) 

(CPC-LAMBDA  SEXPR  ENV  CONT)) 

((EQ  (CAR  SEXPR)  'IF) 

(CPC- IF  SEXPR  ENV  CONT)) 

((EQ  (CAR  SEXPR)  'CATCH) 

(CPC-CATCH  SEXPR  ENV  CONT)) 

((EQ  (CAR  SEXPR)  'LABELS) 

(CPC-LABELS  SEXPR  ENV  CONT)) 

((AND  (ATOM  (CAR  SEXPR)) 

(GET  (CAR  SEXPR)  'AMACRO)) 

(CPC  (FUNCALL  (GET  (CAR  SEXPR)  'AMACRO)  SEXPR)  ENV  CONT)) 
(T  (CPC-FORM  SEXPR  ENV  CONT))))) 


CPC  merely  dispatches  to  one  of  a number  of  subsidiary  routines  based  on  the  form 
of  the  expression  SEXPR.  ENV  represents  the  environment  in  which  SEXPR  will  be 
evaluated;  it  is  a list  of  the  variable  names.  When  CPS  initially  calls  CPC, 
ENV  is  NIL . CONT  is  the  continuation  which  will  receive  the  value  of  SEXPR.  The 
double-quote  (M)  is  like  a single-quote,  except  that  within  the  quoted  expression 
any  subexpressions  preceded  by  comma  (,)  are  evaluated  and  substituted  in  (also, 
any  subexpressions  preceded  by  atsign  (0)  are  substituted  in  a list  segments). 
One  special  case  handled  directly  by  CPC  is  a quoted  expression;  CPC  also 
expands  any  SCHEME  macros  encountered. 


(DEF INE  CPC-ATOM 

(LAMBDA  (SEXPR  ENV  CONT) 

((LAMBDA  (AT)  (IF  CONT  "(.CONT  ,AT)  AT)) 

(COND  ((NUMBERP  SEXPR)  SEXPR) 

((MEMQ  SEXPR  ENV)  SEXPR) 

((GET  SEXPR  'CPS -NAME)) 

(T  ( IMPLQOE  (CONS  'X  (EXPLODEN  SEXPR)))))))) 


For  convenience,  CPC-ATOM  will  change  the  name  of  a global  atom.  Numbers  and 
atoms  in  the  environment  are  not  changed;  otherwise,  a specified  name  on  the 
property  list  of  the  given  atom  is  used  (properties  defined  below  convert 
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into  etc.);  otherwise,  the  name  is  prefixed  with  ■X".  Once  the  name  has 
been  converted,  it  is  converted  to  a form  which  invokes  the  continuation  on  the 
atom.  (If  a null  continuation  is  supplied,  the  atom  itself  is  returned.) 


(DEFINE  CPC-LAMBDA 

( LAMBDA  (SEXPR  ENV  CONT) 

((LAMBDA  (CN) 

((LAMBDA  (LX)  (IF  CONT  *( .CONT  ,LX)  LX)) 

"(LAMBDA  (l(CADR  SEXPR)  .CN) 

.(CPC  (CADDR  SEXPR) 

(APPEND  (CADR  SEXPR)  (CONS  CN  ENV)) 
CN)))) 

(GENTEMP  "C ) ) ) ) 


A LAMBDA  expression  must  have  an  additional  parameter,  the  continuation  supplied 
to  its  body,  added  to  its  parameter  list.  CN  holds  the  name  of  this  generated 
parameter.  A new  LAMBDA  expression  is  created,  with  CN  added,  and  with  its  body 
converted  in  an  environment  containing  the  new  variables.  Then  the  same  test  for 
a null  CONT  is  made  as  in  CPC-ATOM. 


(DEF INE  CPC- IF 

(LAMBDA  (SEXPR  ENV  CONT) 

((LAMBOA  (KN) 

"((LAMBDA  ( . KN) 

.(CPC  (CADR  SEXPR) 

ENV 

((LAMBDA  (PN) 

"(LAMBDA  (.PN) 

(IF  .PN 

.(CPC  (CADDR  SEXPR) 
ENV 
KN) 

.(CPC  (CADDDR  SEXPR) 
ENV 
**)))> 


(GENTEMP 


.CONT)) 


(GENTEMP  "P)))> 


First,  the  continuation  for  an  IF  must  be  given  a name  KN  (rather,  the  name  held 
in  KN;  but  for  convenience,  we  will  continue  to  use  this  ambiguity,  for  the  form 
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of  the  name  is  indeed  Kn  for  some  number  n),  for  it  Mill  be  referred  to  in  two 
places  and  we  wish  to  avoid  duplicating  the  code.  Then,  the  predicate  is 
converted  to  continuation-passing  style,  using  a continuation  which  will  receive 
the  result  and  call  it  PN.  This  continuation  will  then  use  an  IF  to  decide  which 
converted  consequent  to  invoke.  Each  consequent  is  converted  using  continuation 
KN. 


(DEFINE  CPC  -CATCH 

( LAMBDA  (SEXPR  ENV  CONI) 

( ( L AMBON  (EN) 

*( (LAMBDA  ( .EN) 

((LAMBDA  ( .(CADR  SEXPR)) 

.(CPC  (CA00R  SEXPR) 

(CONS  (CADR  SEXPR)  ENV) 
EN)) 

(LAMBDA  (V  C)  ( ,EN  V)))) 

.CONT)) 

(GENTEMP  ' E ) ) ) ) 


This  routine  handles  CATCH  as  defined  in  [Sussman  75],  and  in  converting  it  to 
continuation-passing  style  eliminates  all  occurrences  of  CATCH.  The  idea  is  to 
give  the  continuation  a name  EN,  and  to  bind  the  CATCH  variable  to  a continuation 
(LAMBDA  (VC)  ...)  which  ignores  its  continuation  and  instead  exits  the  catch  by 
calling  EN  with  its  argument  V.  The  body  of  the  CATCH  is  converted  using 
continuation  EN. 


(DEFINE  CPC-LABELS 

(LAMBOA  (SEXPR  ENV  CONT) 

(DO  ((X  (CADR  SEXPR)  (CDR  X)) 

(Y  ENV  (CONS  (CAAR  X)  Y))) 

((NULL  X) 

(00  ((W  (CADR  SEXPR)  (CDR  V)) 

(Z  NIL  (CONS  (LIST  (CAAR  V) 

(CPC  (CADAR  V)  Y NIL)) 

D)) 

((NULL  W) 

’(LABELS  .{REVERSE  Z) 

.(CPC  (CAODR  SEXPR)  Y CONT)))))))) 
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Here  we  have  used  DO  loops  as  defined  in  HacLISP  (DO  is  inplemented  as  a macro  in 
SCHEME).  There  are  two  passes,  one  performed  by  each  DO.  The  first  pass  merely 
collects  in  Y the  names  of  all  the  labelled  LAMBDA  expressions.  The  second  pass 
converts  all  the  LAMBDA  expressions  using  a null  continuation  and  an  environment 
augmented  by  all  the  collected  names  in  Y,  collecting  them  in  Z.  At  the  end,  a 
new  LABELS  is  constructed  using  the  results  in  Z and  a converted  LABELS  body. 


(DEFINE  CPC -FORM 

(LAMBDA  ( SEXPR  ENV  CONT) 

'(LABELS  ( ( LOOP  1 

(LAMBDA  (X  Y Z) 

(IF  (NULL  X) 

(DO  ((F  (REVERSE  (CONS  CONT  Y)) 

(IF  (NULL  (CAR  Z))  F 
(CPC  (CAR  Z) 

ENV 

*( LAMBDA.  (.(CAR  Y))  ,F)))) 
(Y  Y (COR  Y)) 

(Z  Z (COR  Z))) 

((NULL  Z)  F )) 

(COND  ((OR  (NULL  (CAR  X)) 

(ATOM  (CAR  X))) 

(LOOP1  (CDR  X) 

(CONS  (CPC  (CAR  X)  ENV  NIL)  Y) 
(CONS  NIL  Z))) 

((EQ  (CAAR  X)  ’QUOTE) 

( LOOP1  (CDR  X) 

(CONS  (CAR  X)  Y) 

(CONS  NIL  Z))) 

((EQ  (CAAR  X)  ’LAMBDA) 

(LOOP1  (CDR  X) 

(CONS  (CPC  (CAR  X)  ENV  NIL)  Y) 
(CONS  NIL  Z))) 

• (T  (LOOP1  (COR  X) 

(CONS  (GENTEMP  *T)  Y) 

(CONS  (CAR  X)  Z)))))))) 

( LOOP1  SEXPR  NIL  NIL)))) 


This,  the  most  complicated  routine,  converts  forms  (function  calls).  This  also 
operates  in  two  passes.  The  first  pass,  using  LOOP1,  uses  X to  step  down  the 
expression,  collecting  data  in  Y and  Z.  At  each  step,  if  the  next  element  of  X 
can  be  evaluated  trivially,  then  it  is  converted  with  a null  continuation  and 
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added  to  Y,  and  NIL  is  added  to  Z.  Otherwise,  a temporary  name  TN  for  the  result 
of  the  subexpression  is  created  and  put  in  Y,  and  the  subexpression  itself  is  put 
in  Z.  On  the  second  pass  (the  DO  loop),  the  final  continuation-passing  form  is 
constructed  in  F from  the  inside  out.  At  each  step,  if  the  element  of  Z is  non- 
null, a new  continuation  must  be  created.  (There  is  actually  a bug  in  CPC-FORn, 
which  has  to  do  with  variables  affected  by  side-effects.  This  is  easily  fixed  by 
changing  LOOP1  so  that  it  generates  temporaries  for  variables  even  though 
variables  evaluate  trivially.  This  would  only  obscure  the  examples  presented 
below,  however,  and  so  this  was  omitted.) 


(LABELS  ((BAR 

(LAMBDA  (OUMMY  X Y) 

(IF  (NULL  X)  1 |CPS  ready  to  go!| 

(BAR  (PUTPROP  (CAR  X)  (CAR  Y)  ’CPS-NAME) 
(CDR  X) 

(COR  Y)))))) 

(BAR  NIL 

'(+-*//  T NIL) 

•(«-+  --  *•  ////  'T  ’NIL))) 


This  loop  sets  up  some  properties  so  that  will  translate  into  instead  of 

"X+",  etc. 


Now  let  us  examine  some  examples  of  the  action  of  CPS.  First,  let  us  try 
our  old  friend  FACT,  the  iterative  factorial  program. 


(OEFINE  FACT 

(LAMBDA  (N) 

(LABELS  ( (FACT1  (LAMBOA  (H  A) 

(IF  (-  H •)  A 

(FACT1  (-  M 1)  (•  h A)))))) 

(FACT1  N X)))) 


Applying  CPS  to  the  LAMBDA  expression  for  FACT  yields: 
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( #CON  T # 

(LAMBDA  ( N C7) 

(LABELS  ((FACT1 

(LAMBDA  (M  A CIO) 

((LAMBDA  (Kll ) 

(X*  M 0 

(LAMBDA  (PI*) 

(IF  P12  (KU  A) 

" 1 

(LAMBDA  (TX3) 
(•»  H A 


CIO)))) 

(FACT1  N 1 C7 ) ) ) ) 


(LAMBDA  ( T14 ) 

(FACT1  T13  T14  Kll))))))))) 


As  an  example  of  CATCH  elimination,  here  is  a routine  which  is  a 
paraphrase  of  the  SQRT  routine  from  [Sussman  75]: 


(DEFINE  SORT 

(LAMBDA  (X  EPS) 

((LAMBDA  (ANS  LOOPTAG) 

(CATCH  RETURNTAG 

(BLOCK  (ASET1  LOOPTAG  (CATCH  M M)) 
(IF  — 

(RETURNTAG  ANS) 

NIL) 

(ASET*  ANS  •••) 

(LOOPTAG  LOOPTAG)))) 

10 

NIL))) 


Here  we  have  used  " — " and  "===■  as  ellipses  for  complicated  (and  relatively 
uninteresting)  arithmetic  expressions.  Applying  CPS  to  the  LAMBDA  expression  for 
SQRT  yields: 
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(#CONT# 

(LAMBDA  (X  EPS  C33) 

((LAMBDA  (ANS  LOOPTAG  C3«) 

((LAMBDA  (E3S) 

('LAMBDA  (RE TURN TAG) 

((LAMBDA  (ES2) 

((LAMBDA  (M)  (ES2  M)) 

(LAMBDA  (V  C)  (ES2  V)))) 

(LAMBOA  ( TS1 ) 

(XASET'  LOOPTAG  T51 
(LAMBDA  ( T37 ) 

((LAMBOA  (A  • C36 ) (B  C3*)) 

TIT 

(LAMBDA  (CAB) 

((LAMBOA  (K47) 

((LAMBOA  (PSB) 

(IF  PSB 

(RETURNTAG  ANS  K47) 

(K47  'NIL))) 

X — )) 

(LAMBOA  (T42) 

((LAMBOA  (A  B C41)  (B  C4I)) 

T42 

(LAMBOA  (C41) 

(XASET'  ANS  X— 

(LAMBOA  (T45) 

((LAMBDA  (A  B C44) 
(•  C44)) 

T45 

(LAMBOA  (C46) 
(LOOPTAG 
LOOPTAG 
C4S)) 

C4I)))) 

C4B)))) 

E»)))))) 

(LAMBDA  (V  C)  (E3S  V)))) 

C34  ) ) 

1.0 
'NIL 
C33  ) ) ) 


Note  that  the  CATCHes  have  both  been  eliminated.  It  is  left  as  an  exercise  for 


the  reader  to  verify  that  the  continuation-passing  version  correctly  reflects  the 
semantics  of  the  original. 


no 


{Note  Operations  on  Functions) 

It  would  certainly  be  possible  to  define  other  operations  on  functions, 
such  as  determining  the  number  of  arguments  required,  or  the  types  of  the 
arguments  and  returned  value,  etc.  (Indeed,  after  the  dissertation  was  written, 
it  was  decided  to  include  such  an  operator  PROCP  in  [Revised  Report].)  The  point 
is  that  functions  need  not  conform  to  a specific  representation  such  as  S- 
expressions.  At  a low  level,  it  may  be  useful  to  think  of  invocation  as  a 
generic  operator  which  dispatches  on  the  particular  representation  and  invokes 
the  function  in  an  appropriate  manner.  Similarly,  a debugging  package  might  need 
to  be  able  to  distinguish  the  various  representations.  At  the  user  level, 
however,  it  is  perhaps  best  to  hide  this  issue,  and  answer  a type  inquiry  with 
merely  "function". 

{Note  Refinement  of  RABBIT) 

Since  the  original  dissertation  was  written  I have  continued  to  refine 
and  improve  RABBIT.  This  effort  has  included  a complete  rewriting  of  the 
optimizer  to  make  it  more  efficienct  and  at  the  same  time  more  lucid.  It  also 
included  accommodation  of  changes  to  SCHEME  as  documented  in  [Revised  Report]. 
This  work  has  spanned  perhaps  eight  months'  time,  because  the  availability  of 
computer  time  restricted  me  to  testing  RABBIT  only  once  or  twice  a night.  Thus, 
the  actual  time  expended  for  the  improvements  was  much  less  than  ten  hours  a 


week . 


Ill 


(Note  Side-Effect  Classifications} 

The  division  of  side-effects  into  classes  in  RABBIT  was  not  really 
necessary  to  the  primary  goals  of  RABBIT,  but  was  undertaken  as  an  interesting 
experiment  for  our  own  edification.  One  could  easily  Imagine  a more  complex 
taxonomy.  A case  of  particular  interest  not  handled  by  RABBIT  is  dividing  the 
ASET  side-effect  into  ASET  of  each  particular  variable;  thus  an  ASET  on  FOO 
would  not  affect  a reference  to  the  variable  BAR.  This  could  have  been  done  in 
an  ad  hoc  manner,  but  we  are  interested  in  a more  general  method  dealing  only 
with  sets  of  effects  and  affectabilities. 

{Note  Subroutinization } 

We  have  not  said  anything  about  how  to  locate  candidate  expressions  for 
subroutinization.  For  examples  of  appropriate  strategies,  see  [Geschke]  and 
[Aho,  Johnson,  and  Ullman].  Our  point  here  is  that  SCHEME,  thanks  to  the 
property  of  lexical  scoping  and  the  ability  to  write  "anonymous"  functions  as 
LAMBDA-expressions,  provides  an  ideal  way  to  represent  the  result  of  such 
transformations. 
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{Note  Tail-Recursive  OR} 

Since  the  dissertation  was  written,  the  SCHEME  language  was  redefined  in 
[Revised  Report]  to  prescribe  a "tail-recursive"  interpretation  for  the  last  form 
in  an  AND  or  OR.  This  requirement  necessitated  a redefinition  of  OR  which  is  in 
fact  dual  to  the  definition  of  AND. 
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Appendix 


We  present  here  the  complete  working  source  code  for  RABBIT,  written  in 
SCHEME.  (The  listing  of  the  code  was  produced  by  the  "9”  listing  generator, 
written  by  Richard  N.  Stallman,  Guy  L.  Steele  Jr.,  and  other  contributors.) 

The  code  is  presented  on  successive  odd-numbered  pages.  Commentary  on 
the  code  is  on  the  facing  even-numbered  page.  An  index  appears  at  the  end  of  the 
listing,  indicating  where  each  function  is  defined. 


It  should  be  emphasized  that  RABBIT  was  not  written  with  efficiency  as  a 
particular  goal.  Rather,  the  uppermost  goals  were  clarity,  ease  of  debugging, 
and  adaptability  to  changing  algorithms  during  the  development  process  Much 
information  is  generated,  never  used  by  the  compilation  process,  and  then  thrown 
away,  simply  so  that  if  some  malfunction  should  occur  it  would  be  easier  to 
conduct  a post-mortem  analysis.  Information  that  is  used  for  compilation  is 
often  retained  longer  than  necessary.  The  overall  approach  is  to  create  a big 
data  structure  r d then,  step  by  step,  fill  in  slots,  never  throwing  anything 
away,  eve**  *'  may  no  longer  be  needed. 

The  ai9v.  i could  be  increased  in  speed,  particularly  the  optimizer, 
which  often  recomputes  information  needlessly.  Determining  whether  or  not  the 
recomputation  was  necessary  would  have  cluttered  up  the  algorithms,  however, 
making  them  harder  to  read  and  to  modify,  and  so  this  was  omitted.  Similarly, 
certain  improvements  could  dramatically  decrease  the  space  used.  The  larger 
functions  in  RABBIT  can  just  barely  be  compiled  with  a memory  size  of  256K  words 
on  a PDP-10.  However,  it  was  deemed  worthwhile  to  keep  the  extra  information 
available  for  as  long  a time  as  possible. 

The  implementation  of  RABBIT  has  taken  perhaps  three  man-months.  This 
includes  throwing  away  the  original  optimizer  and  rewriting  it  completely,  and 
accomodating  certain  changes  to  the  SCHEME  language  as  they  occurred.  RABBIT  was 
operational,  without  the  optimizer,  after  about  one  man-month's  work.  The 
dissertation  was  written  after  the  first  version  of  the  optimizer  was 
demonstrated  to  work.  The  remaining  time  was  spent  analyzing  the  faults  of  the 
first  optimizer,  writing  the  second  version,  accomodating  language  changes, 
making  performance  measurements,  and  testing  RABBIT  on  programs  other  than  RABBIT 
itself. 


The  main  modules  of  RABBIT  are  organized  something  like  this: 


COMFILE,  TRANSDUCE,  PROCESS-FORM 
COMPILE 

ALPHATIZE 

MACRO- EXPAND 
META-EVALUATE 

PASS 1 -ANALYZE 
ENV-ANALYZE 
TRIV-ANALYZE 
EFFS-ANALYZE 
META- IF-FUDGE 
META-COMBINATION-TRIVFN 
META-COMBINATION-LAMBDA 
SUBST-CANDIDATE 
META-SUBSTITUTE 

CONVERT 
CENV-ANALYZE 
BIND-ANALYZE 
DEPTH-ANALYZE 
CLOSE-ANALYZE 
COMP  I LATE -ONE -FUNCTION 
COMP  I LATE 

COMP-BODY 

ANALYZE 

TRIV-ANALYZE 


(Bookkeeping  and  file  handling) 

(Compile  a function  definition) 

(Convert  input,  rename  variables) 

(Expand  macro  forms) 
(Source-to-source  optimizations) 
(Preliminary  code  analysis) 
(Environment  analysis) 
(Triviality  analysis) 

(Side  effects  analysis) 
(Transform  nested  IF  expressions) 
(Constants  folding) 

( Beta-conversion ) 

(Substitution  feasibility) 
(Substitution,  subsumption) 
(Convert  to  continuation-passing  style) 
(Environment  analysis) 

(Bindings  analysis) 

(Register  allocation) 

(Environment  structure  design) 

(Generate  code,  producing  one  module) 
(Generate  code  for  one  subroutine) 
(Compile  procedure  body) 
(Generate  value-producing  code) 
(Generate  "trivial*  code) 
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The  DECLARE  forms  are  for  the  benefit  of  the  MacLISP  compiler,  which  will 
process  the  result  of  compiling  this  file  (i.e.  RABBIT  compiling  itself).  The 
first  few  forms  are  concerned  with  switch  settings,  allocation  of  memory  within 
the  MacLISP  compiler,  and  loading  of  auxiliary  functions  which  must  be  available 
at  compile  time. 

The  large  block  of  SPECIAL  declarations  contains  the  name  of  every  SCHEME 
function  in  the  file.  This  is  necessary  because  the  run-time  representation  of  a 
global  variable  is  as  a MacLISP  SPECIAL  variable.  The  compiled  function  objects 
will  reside  in  MacLISP  value  cells,  and  SCHEME  functions  refer  to  each  other 
through  these  cells. 

The  second  set  of  SPECIAL  declarations  (variables  whose  names  begin  and 
end  with  a "*")  specify  variables  used  globally  by  RABBIT.  These  fall  into  three 
categories:  variables  containing  properties  of  the  SCHEME  interpreter  which  are 
parameters  for  the  compiler  (e.g.  **ARGUMENT-REGISTERS**) ; switches,  primarily 
for  debugging  purposes,  used  to  control  certain  compiler  operations  (e.g. 
♦FUDGE*);  and  own  variables  for  certain  functions,  used  to  generate  objects  or 
gather  statistics  (e.g.  *GENTENPNUM*  and  *DEPROGNIFY-COUNT*) . 

The  PROCLAIM  forms  are  to  RABBIT  as  DECLARE  forms  are  to  the  MacLISP 
compiler.  These  provide  declarations  to  the  incarnation  of  RABBIT  which  is 
compiling  the  file.  The  subforms  of  a PROCLAIM  form  are  executed  by  RABBIT  when 
it  encounters  the  form  in  a file  being  compiled.  (We  will  see  later  how  this  is 
done . ) 
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; ; ; RABBIT  COMPILER 


(OECLARE  (FASLOAD  (QUUX)  SCHHAC ) ) 

(DECLARE  (MACROS  T)  (NEWIO  T)) 

(DECLARE  (ALLOC  ‘(LIST  (300000  450000  .2)  FIXNUM  50000  SYMBOL  24000))) 
(OECLARE  (DEFUN  DISPLACE  (X  Y)  Y)) 


(DECLARE  (SPECIAL  EMPTY  TRIVFN  GENTEMP  GENFLUSH  GEN-6L0BAL -NAME  PRINT-WARNING  ADDPROP  DELPROP  SETPROP 
ADJOIN  UNION  INTERSECT  REMOVE  SETD1FF  PAIRLIS  COMPILE  PASS1-ANALY2E  TEST-COMPILE 
NOD  IF Y ALPHATIZE  ALPHA-ATOM  ALPMA-LAMBOA  ALPHA- IF  ALPHA-ASET  ALPHA-CATCH 
ALPHA-LABELS  ALPHA -L ABELS -OEFN  ALPHA-BLOCK  MACRO-EXPAND  ALPHA-COMBINATION 
ENV-ANALYZE  TRIV-ANALYZE  TR I V- ANAL YZE -F N -P  EFFS-ANALYZE  EFFS-UNION  EFFS-ANALYZE- IF 
EFFS- ANAL YZE -COMB INAT ION  CHECK -COMB  I NAT  ION- PEFFS  ERASE-NODES  ME TA-EVALUATE 
META- IF -FUDGE  ME TA-COMB INAT ION- TRIVFN  ME  TA-COMB INAT ION  -LAMBOA  SUBST-CANDIDATE 
REANALYZE  1 EFFS- INTERSECT  EFFECTLESS  EFFECTLESS-EXCEPT-CONS  PASSABLE 
META-SUBSTITUTE  COPY-CODE  COPY-NOOES  CNOOIFY  CONVERT  MAKE-RETURN  CONVERT-LAMBDA-FM 
CONVERT- IF  CONVERT-ASET  CONVERT-CATCH  CONVERT-LABELS  CONVERT-COMBINATION 
CENV-ANALYZE  CENV-TRIV-ANALYZE  CENV-CCOMB I NAT  ION -ANALYZE  BIND-ANAL  YZE  REFO-VARS 
BIND-ANALYZE-CLAMBDA  B 1ND- ANALYZE -CONT 1NUAT  ION  BIND-ANAL YZE -CIF  BIND-ANAL  YZE-CASET 
BIND-ANALYZE-CLABELS  BIND -ANAL YZE -RE TURN  BIND-ANALYZE-CCOMB INAT  ION 
B I ND-CCOMB INAT ION -ANAL YZE  DEPTH- ANALYZE  F I L TER -CLOSE REFS  CLOSE -ANALYZE  COMPILATE 
OEPROGNIFY1  TEMPLOC  ENVCARCDR  REGSLIST  SET-UP-ASETVARS  COMP-BODY  PROOUCE-IF 
PRODUCE-ASET  PROOUCE -LABELS  PRODUCE -LAMBDA-COMB IVAT ION  PRODUCE  - TR 1VFN -COMB  INAT  ION 
PRODUCE -TR IVFN-COMB  INAT  ION-CONTINUAT  ION  PRODUCE -TR IVF  N-LCMB  INAT  ION-CVAR I ABLE 
PROOUCE -COMB  INAT  ION  PRODUCE -COMB INAT  ION-VAR IAULE  ADJUST -KNOWNFN-CENV 
PRODUCE -CONT I NUA r ION -RE TURN  PRODUCE-RETURN  PRODUCE-RETURN- 1 LAMBDACATE  PSETQIFY 
PSETOIFY-METHOS  ? PSETQIFY-METHOO-3  PSETQ-ARGS  PSETQ-ARGS-ENV  PSETO-TEMPS 
MAPANALYZE  ANALYZE  ANALYZE -CL AMBDA  ANALYZE -CONT I NUAT ION  ANALYZE-CIF  ANALYZE -CLABELS 
ANALYZE -CCOMB INAT IOM  ANALYZE -RETURN  LOOKUPICATE  CONS-CLOSEREFS  OUTPUT-ASET 
CONDICATE  OECARCORATE  TRIVIALIZE  TR IV-L AMBDACATE  COMPILATE -ONE -FUNCT ION 
COMPILATE-LOOP  USED- TEMPLOCS  REMARK-ON  MAP-USER-NAMES  COMFILE  TRANSDUCE 
PROCESS-FORM  PPOCESS-DEF INE-FORM  PROCESS-DEFINITION  CLEANUP  SEXPRFY  CSEXPRFY 
CHECK-NUMBER-OF  -ARGS  DUMPIT  STATS  RESET-STATS  INIT-RABBIT) ) 

(DECLARE  (SPECIAL  ‘EMPTY*  *GENTEMPNUM*  *GENTEMPL 1ST*  »GLOBAL-GEN-PREF  IX*  *ERROR-COUNT*  *ERROR-L  1ST* 
•TEST*  ‘TESTING*  ‘OPTIMIZE*  ‘REANALYZE*  ‘SUBSTITUTE*  *FU0GE*  »NEW-FUDGE* 

*S INGLE -SUBST*  *LAMBOA-SUBST*  *FLUSH-ARGS*  *STAT  VARS*  *DEAD-COUNT*  ‘FUDGE -COUNT* 
‘FOLO-COUNT*  *FLUSH-COUNT*  *CONVERT-COUNT*  *SUBST-COUNT«  *DEPROGN IF Y-COUNT* 

•LAMBDA -BODY-SUBST*  *LAMBDA-BOOY-SUBST- TRY-COUNT*  *LAMBOA-BOOY-SUBST-SUCCESS-COUNT* 
•CHECK-PEFFS*  “CONT+ARG-REGS**  **ENV+CONT+ARG-REGS*»  **ARGUM?NT-REG ISTFRS** 
«*NUMBER-OF - ARG-REGS**  ‘BUFFER -RANDOM -FORMS*  ‘DISPLACE -SW* ) ) 

(PROCLAIM  (*EXPR  PRINT-SHORT) 

(SET'  ‘BUFFER -RANDOM- FORMS*  NIL) 

(ALLOC  '(LIST  (240000  340000  1000)  FIXNUM  (30000  40000  1000) 

SYMBOL  (14000  24000  NIL)  HUNK4  (20000  53000  NIL) 

HUNK8  ( 20000  50000  NIL)  HUNK16  (20000  60000  NIL)))) 


(SET'  *STAT - VARS*  ' ( *DEAO-COUNT*  ‘FUDGE-COUNT*  *FOLD-COUNT*  ‘FLUSH-COUNT*  *CONVERT-COUNT* 
•SUBST-COUNT*  *DEPROGN IF Y-COUNT*  »LAMBDA-BOOY-SUBST- TRY-COUNT* 
•LAMBDA-BOOY-SUBST-SUCCESS-COUNT* ) ) 


(ALI.OC  '(LIST  (240000  340000  1000)  FIXNUM  (30000  40000  1000) 
SYMBOL  (14000  24000  NIL)  HUNK4  (20000  50000  NIL) 
HUNKS  (20000  50000  NIL)  HUNK16  (20000  70000  NIL))) 


(APPLY  'GCTWA  ' ( T ) ) 
(REPLACE  ) 

(SET*  ‘DISPLACE-SW*  NIL) 
(GRINDEF  ) 


;GC  USELESS  ATOMS  (CAN'T  SAY  (EVAL*  (6CTWA  T))  BECAUSE  OF  NCOMPLR) 
; UNDO  ANY  DISPLACED  MACROS 
; DON'T  LET  MACROS  SELF -DISPLACE 
;LOAD  THE  GRINDER  (PRETTY-PRINTER) 


(OECLARE 

(OECLARE 

(DECLARE 

(OECLARE 


(/ODEFINE  OEFINE  | SCHEME  FUNCTION!)) 
(/ODEFINE  DEFMAC  |MACL1SP  MACRO|)) 
(/ODEFINE  SCHMAC  |PDP-10  SCHEME  MACRO!)) 
(/ODEFINE  MACRO  | SCHEME  MACRO))) 


;OECLARATIONS  FOR  LISTING  PROGRAM 
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The  variable  *EMPTY*  is  initialized  to  a unique  object  (a  list  cell  whose 
car  is  *EMPTY*  --  this  is  so  that  no  other  object  can  be  EQ  to  it,  but  it  can  be 
easily  recognized  when  printed)  which  is  used  to  initialize  components  of 
structures.  (We  will  see  later  how  such  structures  are  defined.)  We  do  not  use, 
say,  NIL  to  represent  an  empty  component  because  NIL  might  be  a meaningful  value 
for  that  component.  The  predicate  EMPTY  is  true  of  the  unique  object. 

TR1VFN  is  a predicate  which  is  true  of  "trivial"  functions.  A function 
is  trivial  if  it  is  a MacLISP  primitive  (an  EXPR,  SUBR,  or  LSUBR),  or  has  been 
declared  to  be  primitive  via  a *EXPR  or  *LEXPR  proclamation. 

(INCREMENT  F00)  expands  into  the  code  (ASET'  F00  (♦  FOO  1)). 

CATENATE  is  a utility  macro  which  may  be  thought  of  as  a function.  Given 
any  number  of  S-expressions  it  produces  an  atomic  symbol  whose  print  name  is  the 
concatenation  of  the  print  names  of  the  S-expressions.  Usually  the  S-expressions 
will  be  atomic  symbols  or  numbers. 

(CATENATE  'FOO  '-  43)  =>  FOO-43 


• GENTEMP  is  used  to  generate  a new  unique  symbol,  given  a specified 
prefix.  The  global  variable  *GENTEMPNUM*  starts  at  zero  and  increases 
monotonicially.  Each  call  to  GENTEMP  catenates  the  prefix,  a hyphen,  and  a new 
value  of  *GENTEMPNUM*.  Because  the  numeric  suffixes  of  the  generated  symbols 
increase  with  time,  one  can  determine  in  which  order  symbols  were  generated.  We 
also  will  use  different  prefixes  for  different  purposes,  so  that  one  can  tell 
which  part  of  the  compiler  generated  a given  symbol.  This  information  can  be 
invaluable  for  debugging  purposes;  from  the  names  of  the  symbols  appearing  in  a 
data  structure,  one  can  determine  how  that  structure  was  created  and  in  what 
order.  (The  generated  symbols  are  themselves  used  primarily  as  simple  markers, 
or  as  simple  structures  (property  lists).  The  use  of  the  print  names  amounts  to 
tagging  each  marker  or  structure  with  a type  and  a creation  timestamp.  A LISP- 
like  language  encourages  the  inclusion  of  such  information.) 

(GENTEMP  'NODE)  =>  N0DE-Z534 


A list  of  all  generated  symbols  is  maintained  in  *GENTEHPLIST*.  GENFLUSH 
can  be  called  to  excise  all  generated  symbols  from  the  MacLISP  obarray;  this  is 
periodically  necessary  when  compiling  a large  file  so  that  unneeded  symbols  may 
be  garbage-collected.  The  symbols  are  initially  interned  on  the  obarray  in  the 
first  place  for  ease  of  debugging  (one  can  refer  to  them  by  name  from  a debugging 
breakpoint).  GEN-GLOBAL-NAME  is  used  to  generate  a symbol  to  be  used  as  a run- 
time name  by  the  compiled  code.  The  prefix  for  such  names  is  initially  "7"  for 
testing  purposes,  but  is  initialized  by  the  file  transducer  as  a function  of  the 
name  of  the  file  being  compiled.  This  allows  separately  compiled  files  to  be 
loaded  together  without  fear  of  naming  conflicts. 
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(COND  ((NOT  (BOUNDP  ' ‘EMPTY*)) 

(SET'  ‘EMPTY*  (LIST  ‘‘EMPTY*)))) 

(DEFINE  EMPTY 

( LAMBOA  (X)  (EQ  X ‘EMPTY*))) 


(DEFINE  TRIVFN 

(LAMBOA  (SYM) 

(GETL  SYM  ' ( EXPR  SUBR  LSUBR  *EXPR  *LEXPR)))) 


( OEFMAC  INCREMENT  (X)  "(ASET*  .X  (♦  .X  1))) 

( DEFMAC  CATENATE  ARGS 

"( IMPLOOE  (APPEND  !(MAPCAR  ‘(LAMBDA  (X) 

(COND  ((OR  (ATOM  X)  (NOT  (EQ  (CAR  X)  'QUOTE))) 
"(EXPLOOEN  ,X)) 

(T  '(QUOTE  .(EXPLODEN  (CADR  X)))))) 

ARGS)))) 


(COND  ((NOT  (BOUNDP  ‘ *GENTEMPNUM‘ ) ) 

(SET1  ‘GENTEMPNUM*  0))) 

(COND  ((NOT  (BOUNDP  ■ ‘GENTEMPL 1ST* ) ) 

(SET*  *GENTEMPL  1ST*  NIL))) 

(DEFINE  GENTEMP 

(LAMBDA  (X) 

( BLOCK  (INCREMENT  ‘GENTEMPNUM*) 

(LET  ((SYM  (CATENATE  X ‘|-|  ‘GENTEMPNUM*))) 

(ASET'  *GENTEMPLIST*  (CONS  SYM  ‘GENTEMPLIST* ) ) SYM)))) 

(DEFINE  GENFLUSH 

(LAMBDA  ( ) 

(BLOCK  ( AMAPC  REMOB  *GENTEMPLIST*) 

(ASET'  *6ENTEMPLIST*  NIL)))) 

(DEFINE  GEN-GLOBAL-NAME 

(LAMBOA  ()  (GENTEMP  *GLOBAL -GEN - PREF IX* ) ) ) 

(SET*  *GLOBAL-GEN-PREF IX*  • | ? | ) 
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WARN  is  a macro  used  to  print  a notice  concerning  an  incorrect  program 
being  compiled.  It  generates  a call  to  PR  I NT- WARNING,  which  maintains  a count 
and  a list  of  the  error  messages,  and  prints  the  message,  along  with  any 
associated  useful  quantities. 

(WARN  | FOO  is  greater  than  BAR|  FOO  BAR) 

would  print  (assuming  the  values  of  FOO  and  BAR  were  43  and  15) 

;Warning:  FOO  is  greater  than  BAR 
; 43 
*.  15 

WARN  is  used  only  to  report  errors  in  the  program  being  compiled.  The  MacLlSP 
ERROR  function  is  used  to  signal  internal  inconsistencies  in  the  compiler. 


ASK  is  a macro  which  prints  a message  and  then  waits  for  a reply. 
Typically  NIL  means  "no",  and  anything  else  means  "yes*. 


SX  and  CSX  are  debugging  aids  which  print  intermediate  data  structures 
internal  to  the  compiler  in  a readable  form.  They  make  use  of  SPRINTER  (part  of 
the  Maclisp  GRIND  pretty-printing  package)  and  of  SEXPRFY  and  CSEXPRFY,  which  are 
defined  below. 


The  EQCASE  macro  provides  a simple  dispatching  control  structure.  The 
first  form  evaluates  to  an  item,  and  the  clause  whose  keyword  matches  the  item  is 
executed.  If  no  clause  matches,  an  error  occurs.  For  example: 

(EQCASE  TRAFFIC-LIGHT 

(RED  (PRINT  ’STOP)) 

(GREEN  (PRINT  ’GO)) 

(YELLOW  (PRINT  ’ACCELERATE)  (CRASH))) 

expands  into  the  code: 

(COND  ((EQ  TRAFFIC-LIGHT  ’RED)  (PRINT  ’STOP)) 

( ( EQ  TRAFFIC-LIGHT  ’GREEN)  (PRINT  ’GO)) 

((EQ  TRAFFIC-LIGHT  ’YELLOW)  (PRINT  ’ACCELERATE)  (CRASH)) 

(T  (ERROR  ' | Losing  EQCASE | TRAFFIC-LIGHT  ’FAIL-ACT))) 
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002 

(OEFMAC 

WARN  (MSG  . STUFF) 

003 

"(PRINT-WARNING  '.MSG  (LIST  OSTUFF))) 

004 

005 

(OEF INE 

PRINT-WARNING 

006 

(LAMBDA  (MSG  STUFF) 

007 

(BLOCK  (INCREMENT  *ERROR-COUNT* ) 

ooa 

(ASET'  *ERR0R-L1ST»  (CONS  (CONS  MSG  STUFF)  *ERROR-l 1ST* ) ) 

009 

(TYO  7 (SYMEVAL  'TYO))  ;BELL 

010 

(TERPRI  (SYMEVAL  'TYO)) 

Oil 

(PR INC  * | ;Warning:  | (SYMEVAL  ’TYO)) 

012 

(TYO  7 (SYMEVAL  'TYO))  ;BELL 

013 

(PR INC  MSG  (SYMEVAL  ’TYO)) 

014 

( AMAPC  PRINT-SHORT  STUFF)))) 

015 

016 

(OEFUN  PRINT-SHORT  (X) 

017 

((LAMBDA  (PRIMLEVEL  PRINLENGTH  TERPR1 ) 

oia 

(TERPRI  (SYMEVAL  ‘TYO)) 

019 

(PRINC  • j;  | (SYMEVAL  'TYO)) 

020 

( PRIN1  X (SYMEVAL  ’TYO))) 

021 

3 6 T)) 

022 

023 

024 

(SCHMAC 

ASK  (MSG) 

025 

"(BLOCK  (TERPRI)  (PRINC  * ,MSG ) (TYO  4S)  (READ))) 

026 

027 

026 

(DEFMAC 

SX  (X)  "(SPRINTER  (SEXPRFY  .X  NIL)))  ; DEBUGGING 

AID 

029 

(OEFMAC 

CSX  (X)  "(SPRINTER  (CSEXPRFY  ,X)))  ; DEBUGGING 

AID 

030 

031 

032 

(OEFMAC 

EQCASE  (OBJ  . CASES) 

033 

"(CONO  0 ( MAPCAR  '(LAMBDA  (CASE) 

034 

(OR  (ATOM  (CAR  CASE)) 

035 

(ERROR  ' (Losing  EQCASE  clausel)) 

036 

"((EQ  .OBJ  '.(CAR  CASE))  B(CDR  CASE))) 

037 

CASES) 

036 

(T  (ERROR  ' |Los1ng  EQCASE | .OBJ  'FAIL-ACT)))) 
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The  next  group  of  macros  implement  typed  data  structures  with  named 
components.  ACCESSFN,  CLOBBER,  and  HUNKFN  allow  definition  of  very  general 
structure  access  functions.  Their  precise  operation  is  not  directly  relevant  to 
this  exposition;  suffice  it  to  say  that  they  are  subsidiary  to  the  DEFTYPE  macro 
on  the  next  page. 


DEFTYPE  defines  structure  “data  types"  with  named  components.  These 
structures  are  implemented  as  MacLISP  hunks.  (A  hunk  is  essentially  a kind  of 
list  cell  with  more  than  two  pointer  components;  it  may  be  thought  of  as  a 
short,  fixed-length  vector.  Hunks  are  accessed  with  the  function  (CXR  n hunk), 
which  returns  the  nth  component  of  the  hunk.  (RPLACX  n hunk  newval)  analogously 
alters  the  nth  component.  CXR  and  RPLACX  are  thus  similar  to  CAR/CDR  and 
RPLACA/RPLACD. ) 

Slot  0 of  each  hunk  is  reserved  for  a "property  list";  this  feature  is 
not  used  in  RABBIT.  Slot  1 always  contains  an  atomic  symbol  which  is  the  name  of 
the  type.  Thus  every  structure  explicitly  bears  its  type.  The  form  (HUNKFN  TYPE 
1)  creates  a function  (actually  a macro)  called  TYPE  which  when  applied  to  a hunk 
will  fetch  slot  1.  Slots  Z upward  of  a hunk  are  used  to  contain  named 
components.  A structure  does  not  contain  the  component  names.  (However,  the 
symbol  which  is  the  name  of  the  type  does  have  a list  of  the  component  names  on 
its  property  list.  This  is  useful  for  debugging  purposes.  There  is,  for 
example,  a package  which  pretty-prints  structured  data  types,  showing  the 
components  explicitly  as  name-value  pairs,  which  uses  this  information.) 


RABBI T 568  05/15/78 Page  4 


001 

002 

003 

004 

005 

006 

007 

008 

009 

010 

011 

012 

013 

014 

015 

016 

017 

018 

019 

020 
021 
022 

023 

024 

025 

026 

027 

028 

029 

030 

031 

032 

033 


( DECLARE  (/^DEFINE  ACCESSFN  | ACCESS  MACROI)) 

(DEFMAC  ACCESSFN  (NAME  UVARS  FETCH  . PUT) 

((LAMBDA  (VARS  CNAME ) 

(DO  ((A  VARS  (COR  A)) 

(B  '*Z*  "(COR  , B ) ) 

(C  NIL  (CONS  "(CAR  .B)  C))) 

((NULL  A) 

"(PROGN  'COMPILE 

(DEFMAC  .NAME  *Z* 

((LAMBDA  ,(NRE VERSE  (COR  (REVERSE  VARS))) 

.FETCH) 

!( REVERSE  (CDR  C)))) 

(DEFMAC  .CNAME  *Z* 

((LAMBDA  .VARS 

.(COND  (PUT  (CAR  PUT)) 

(T  "(CLOBBER  ..FETCH 

.THE-NEW-VAIUE)))) 

KREVERSE  C))))))) 

(COND  (PUT  UVARS) 

(T  (APPEND  UVARS  ' ( THE -NEV-VALUE ) ) ) ) 

(CATENATE  '|CLOBBER-|  NAME))) 

(DEFMAC  CLOBBER  (X  Y) 

"(.(CATENATE  ' ICLOBBER- | (CAR  X))  f(C0R  X)  ,Y)) 


(DECLARE  (/POEFINE  HUNKFN  | HUNK  ACCESS  MACRO | ) ) 


(DEFMAC  HUNKFN  (NAME  SLOT) 

"(ACCESSFN  .NAME  (THE-HUNK  NEV-VALUE) 

"(CXR  ..SLOT  .THE-HUNK) 

"( RPLACX  ..SLOT  .THE-HUNK  .NEV-VALUE))) 
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Consider  for  example  the  form 

(DEFTYPE  LAMBDA  (UVARS  VARS  BODY)) 

This  defines  a structured  data  type  called  LAMBDA  with  three  named  components 
UVARS,  VARS,  and  BODY.  It  also  defines  a series  of  macros  for  manipulating  this 
data  type. 

For  access,  the  macros  LAMBDANUVARS,  LAMBDA\VARS,  and  LAMBDA\BODY  are 
defined.  These  each  take  a single  argument,  a data  structure  of  type  VARIABLE, 
and  return  the  appropriate  component.  (The  TYPE  function  can  also  be  applied  to 
the  data  object,  and  will  return  LAMBDA.) 

For  construction,  a macro  CONS-LAMBDA  is  defined.  For  example,  the  form: 

(CONS-LAMBDA  (UVARS  = LIST1) 

(VARS  = LISTZ ) ) 

would  construct  a LAMBDA  structure  with  the  TYPE,  UVARS,  VARS,  and  BODY  slots 
initialized  respectively  to  LAMBDA,  the  value  of  L1ST1,  the  value  of  LIST2,  and 
the  "empty  object"  (recall  the  EMPTY  predicate  above).  Any  component  names 
(possibly  none!)  may  be  initialized  in  a CONS-xxx  form,  and  any  components  not 
mentioned  will  be  initialized  to  the  empty  object.  (The  "="  signs  are  purely 
syntactic  sugar  for  mnemonic  value.  They  can  be  omitted.) 

For  alteration  of  components,  a macro  ALTER-LAMBDA  is  defined.  For 

example,  the  form 

(ALTER-LAMBDA  FOO 

(UVARS  :=  L 1 ST 1 ) 

(BODY  :=  (LIST  A B))) 

would  alter  the  UVARS  and  BODY  components  of  the  value  of  FOO  (which  should  be  a 
LAMBDA  structure  - this  is  not  checked)  to  be  respectively  the  values  of  LIST1 
and  (LIST  A B).  Any  non-zero  number  of  components  may  be  modified  by  a single 
ALTF.R-xxx  form.  (The  ":  = " signs  are  purely  syntactic  sugar  also.) 

A great  advantage  of  using  these  structure  definitions  is  that  it  is  very 
easy  to  add  or  delete  components  during  the  development  of  the  program.  In 
particular,  when  a new  component  is  added  to  a type,  it  is  not  necessary  to  find 
all  instances  of  creations  of  objects  of  that  type;  they  will  simply 
automatically  initialize  the  new  slot  to  the  empty  object.  Only  parts  of  the 
program  which  are  relevant  to  the  use  of  the  new  component  need  be  changed. 
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(DECLARE  (/90EFINE  DEF TYPE  | DATA  TYPED) 

;;;  SLOT  0 IS  ALWAYS  THE  PROPERTY  LIST,  AND  SLOT  1 THE  HUNK  TYPE. 

(HUHKFN  TYPE  1) 

( DEF MAC  OEFTYPE  (NAME  SLOTS  SUPP) 

"(PROGN  'COMPILE 

( OEFMAC  .(CATENATE  ’ ICONS- 1 NAME)  KWDS 
(PROGN  (00  ((K  KWOS  (COR  K))) 

((NULL  K)) 

(OR  .(CONO  ((COR  SLOTS)  "(MEMQ  (CAAR  K)  \ SLOTS)) 

(T  *(EQ  (CAAR  K)  '.(CAR  SLOTS)))) 

(ERROR  '.(CATENATE  ' | Invalid  Keyword  Argument  to  CONS-I 
NAME) 

(CAR  K) 

■FAIL-ACT))) 

"(HUNK  '.'.NAME 

1(00  ( ( S 1 .SLOTS  (COR  S)) 

(X  NIL 

(CONS  ((LAMBDA  (KWO) 

(CONO  (KWO  (CAR  (LAST  KWO ) ) ) 

(T  '*EMPTY* ) ) ) 

(ASSQ  (CAR  S)  KWDS)) 

X)>) 

((NULL  S)  (NRE VERSE  X))) 

NIL))) 

(OEFMAC  .(CATENATE  ' | ALTER- | NAME)  (OBJ  . KWOS) 

(PROGN  (00  ((K  KWDS  (COR  K))) 

((NULL  K)) 

(OR  .(CONO  ((COR  SLOTS)  "(MEMO  (CAAR  K)  '.SLOTS)) 

(T  "(EQ  (CAAR  K)  '.(CAR  SLOTS)))) 

(ERROR  '.(CATENATE  '|Invel1d  Keyword  Argument  to  ALTER- 1 
NAME) 

(CAR  K) 

'FAIL-ACT))) 

(DO  ((I  (4  (LENGTH  KWOS)  1)  (-  I 1)) 

(VARS  NIL  (CONS  (GENSYM)  VARS))) 

(("  I 0) 

"((LAMBDA  .VARS 

.(BLOCKIFY 

(MAPCAR  '(LAMBDA  (K  V) 

"(CLOBBER  (.(CATENATE  '.NAME 

•I\l 

(CAR  K)) 
(.(CAR  VARS))) 
(.V))) 

KWDS 

(COR  VARS)))) 

(LAMBOA  ()  .OBJ) 

B( MAPCAR  '(LAMBDA  (K)  "(LAMBDA  ()  .(CAR  (LAST  K)))) 
KWDS)))))) 

9(00  ( ( S SLOTS  (COR  S)) 

(N  2 (♦  N 1)) 

(X  NIL  (CONS  "(HUNKFN  .(CATENATE  NAME  ' |\|  (CAR  S)) 

.N) 

X))) 

((NULL  S)  (NRE VERSE  X))) 

(OEFPROP  .NAME  .SLOTS  COMPONENT-NAMES) 

(OEFPROP  .NAME  .SUPP  SUPPRESSEO-COMPONENT-NAMES ) 

'(TYPE  .NAME  DEFINEO))) 
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On  this  page  are  two  groups  of  utility  functions.  One  group  manipulates 
property  lists,  and  the  other  manipulates  sets  of  objects  represented  as  lists. 


For  (ADDPROP  SYfl  VAL  PROP),  the  PROP  property  of  the  symbol  SYM  should  be 
a list  of  things.  The  object  VAL  is  added  to  this  list  if  it  is  not  already  a 
member  of  the  list. 

DELPROP  performs  the  inverse  of  ADDPROP;  it  removes  an  object  from  a 
list  found  as  the  property  of  a symbol. 

(SETPROP  SYM  VAL  PROP)  puts  the  property-value  pair  PROP, VAL  on  the 
property  list  of  SYM;  but  if  SYM  already  has  a PROP  property,  it  is  an  error 
unless  the  new  value  is  the  same  as  (EQ  to)  the  existing  one.  That  is,  a 
redundant  SETPROP  is  permitted,  but  not  a conflicting  one. 


(ADJOIN  ITEM  SET)  produces  a new  set  SET  U (ITEM). 

UNION  produces  the  union  of  two  sets. 

INTERSECT  produces  the  intersection  of  two  sets. 

(REMOVE  ITEM  SET)  produces  a new  set  SET  - (ITEM). 

(SETDIFF  SET1  SETZ)  produces  the  set  SET1  - SETZ. 

All  of  the  set  operations  are  accomplished  non-destructively ; that  is, 
the  given  arguments  are  not  modified.  Examples: 

(ADJOIN  'A  '(A  B C))  =>  (A  B C) 

(ADJOIN  'A  * ( B C D))  =>  (A  B C D) 

(UNION  '(A  B C)  * ( B D F))  =>  (D  F A B C) 

(INTERSECT  ’(A  B C)  '(B  D F))  = > (B) 

(REMOVE  'B  ’(A  B C))  =>  (A  C) 

(SETDIFF  '(A  B C)  '(B  D F))  =>  (A  C) 


001 
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002 

;;;  ADD  TO  A PROPERTY  WHICH  IS  A LIST  OF  THINGS 

003 

004 

(DEFINE  ADOPROP 

005 

(LAMBDA  (SYM  VAL  PROP) 

006 

(LET  ((L  (GET  SYM  PROP))) 

007 

( IF  (NOT  (MEMO  VAL  L)) 

006 

(PUTPROP  SYM  (CONS  VAL  L)  PROP))))) 

009 

010 

;;;  INVERSE  OF  ADOPROP 

Oil 

012 

(OEFINE  DELPROP 

013 

(LAMBDA  (SYM  VAL  PROP) 

014 

(PUTPROP  SYM  ( DELQ  VAL  (GET  SYM  PROP))  PROP))) 

015 

016 

;;;  LIKE  PUTPROP,  BUT  INSIST  ON  NOT  CHANGING  A VALUE  ALREADY  THERE 

017 

016 

(OEFINE  SETPROP 

019 

(LAMBDA  (SYM  VAL  PROP) 

020 

(LET  ((L  (GETL  SYM  (LIST  PROP)))) 

021 

(IF  (AND  L (NOT  (EO  VAL  (CADR  L ) ) ) ) 

022 

(ERROR  ' | At  tempt  to  redefine  a unique  propertyl 

023 

(LIST  'SETPROP  SYM  VAL  PROP) 

024 

•FAIL-ACT) 

025 

(PUTPROP  SYM  VAL  PROP))))) 

026 

\ 

027 

;;;  OPERATIONS  ON  SETS.  REPRESENTED  AS  LISTS 

026 

029 

(DEFINE  ADJOIN 

030 

(LAMBOA  (X  S) 

031 

( IF  (MEMO  X S)  S (CONS  X S)))) 

032 

033 

(OEFINE  UNION 

034 

(LAMBDA  (X  Y) 

035 

(DO  ((Z  Y (COR  Z)) 

036 

(V  X (ADJOIN  (CAR  Z)  V))) 

037 

((NULL  Z)  V)))) 

036 

0 39 

(OEFINE  INTERSECT 

040 

(LAMBOA  (X  Y) 

041 

(IF  (NULL  X) 

042 

NIL 

043 

( IF  (MEMO  (CAR  X)  Y) 

044 

(CONS  (CAR  X)  (INTERSECT  (COR  X)  Y)) 

045 

( INTERSECT  (CDR  X)  Y))))) 

046 

047 

(OEFINE  REMOVE 

046 

(LAMBDA  (X  S) 

049 

(IF  (NULL  S) 

050 

S 

051 

(IF  (EO  X (CAR  S)) 

052 

(COR  S) 

053 

((LAMBDA  (Y) 

054 

( IF  (EO  Y (CDR  S))  S 

055 

(CONS  (CAR  S)  Y))) 

056 

(REMOVE  X (COR  S))))))) 

057 

056 

(OEFINE  SETDIFF 

059 

(LAMBOA  (X  Y) 

060 

(DO  ((Z  X (CDR  Z)) 

061 

(W  NIL  ( IF  (MEMO  (CAR  Z)  Y) 

062 

W 

063 

(CONS  (CAR  Z>  W)))) 

064 

((NULL  Z)  V)))) 

0 

- • -^1 
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The  PAIRLIS  function  is  similar  to,  but  not  identical  to,  the  function  of 
the  same  name  in  the  LISP  1.5  Manual.  The  difference  is  that  the  pairs  of  the 
association  list  produced  are  2-lists  rather  than  single  conses.  This  was  done 
purely  so  that  structures  produced  by  PAIRLIS  would  be  more  readable  when 
printed;  the  ease  of  debugging  was  considered  worth  the  additional  CONS  and 
access  time. 

(PAIRLIS  '(A  B C)  '(X  Y Z)  ’ ( ( F P)  (G  Q))) 

=>  ( (C  Z)  (B  Y)  (A  X)  (F  P)  (G  Q)) 


The  COMPILE  function  is  the  main  top-level  function  of  the  compiler.  It 
is  responsible  for  invoking  each  phase  of  the  compiler  in  order.  NAME  is  the 
name  of  a function  (an  atomic  symbol),  and  LAMBDA-EXP  the  corresponding  lambda- 
expression;  these  are  easily  extracted,  for  example,  from  a SCHEME  DEFINE-f orm. 
SEF.-CRDD  is  NIL  for  normal  processing,  or  T for  debugging  purposes.  OPTIMIZE  is 
a switch  controlling  whether  the  optimization  phase  should  be  Invoked;  it  can  be 
T,  NIL,  or  MAYBE  (meaning  to  ask  the  (human)  debugger). 

The  overall  flow  within  COMPILE  is  as  follows:  check  number  of 
arguments;  apply  ALPHATIZE  to  the  lambda-expression  to  produce  the  pass  1 data 
structure;  optionally  optimize  this  data  structure;  perform  pass  1 analysis; 
convert  the  pass  1 data  structure  to  a pass  2 (continuation-passing  style)  data 
structure;  perform  pass  2 analysis;  generate  code.  The  value  of  COMPILE  is  the 
MacLISP  code  produced  by  the  code  generator. 


PASS1 -ANALYZE  is  a separate  function  so  that  it  can  be  used  by  the 
optimizer  to  re-analyze  newly  created  subexpressions. 


CL  is  a debugging  utility.  (CL  F00)  causes  the  function  FOO  (which 
should  be  defined  in  the  running  SCHEME  into  which  the  compiler  has  been  loaded) 
to  be  compiled.  Various  debugging  facilities,  such  as  SEE-CRUD,  are  enabled. 
This  is  done  by  using  TEST-COMPILE. 
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(DEFINE  PA1RLIS 

(LAMBDA  (LI  LZ  L) 

(DO  ((V  LI  (CDR  V)) 

(U  L2  (COR  U)) 

(E  L (CONS  (LIST  (CAR  V)  (CAR  U))  E))) 

((NULL  V)  E)))) 


(DEFINE  COMPILE 

(LAMBDA  (NAME  LAMBDA-EXP  SEE-CRUD  OPTIMIZE) 

(BLOCK  (CHECK-NUMBER-OF-ARGS  NAME 

(LENGTH  (CADR  LAMBDA-EXP)) 

T) 

(LET  ((ALPHA-VERSION  (ALPHATIZE  LAMBOA-EXP  NIL))) 

(IF  (AND  SEE-CRUD  (ASK  |See  alpha-conversion? | ) ) 

(SX  ALPHA- VERS  ION)) 

(LET  ((OPT  (IF  (EO  OPTIMIZE  'MAYBE) 

(ASK  | Optimize?! ) 

OPTIMIZE))) 

(LET  ((META-VERSION 
(IF  OPT 

(META-EVALUATE  ALPHA- VERS  ION ) 

(PASS1-ANALYZE  ALPHA-VERSION  NIL  NIL)))) 

(OR  (AND  (NULL  ( NODE\REFS  META-VERS ION ) ) 

(NULL  (NODENASETS  META-VERS ION )) ) 

(ERROR  ' | ENV- ANALYZE  lost  - COMPILE! 

NAME 

’FAIL-ACT)) 

(IF  (AND  SEE-CRUD  OPT  (ASK  |See  meta-evalua* 1on?|  ) ) 

(SX  META-VERS ION)) 

(LET  ( (CPS-VERSION  (CONVERT  META-VERSION  NIL  (NOT  (NULL  OPT))))) 
(IF  (AND  SEE-CRUD  (ASK  |See  CPS-convers Ion? ) ) ) 

(CSX  CPS-VERSION)) 

(CENV-ANALYZE  CPS-VERSION  NIL  NIL) 

(BIND-ANAL YZE  CPS-VERSION  NIL  NIL) 

(OEPTH-ANALYZE  CPS-VERSION  0) 

(CLOSE -ANALYZE  CPS-VERSION  NIL) 

( COMP IL ATE -ONE -FUNCTION  CPS-VERSION  NAME)))))))) 

(DEFINE  PASS1-ANALYZE 

(LAMBDA  (NODE  REOO  OPT) 

(BLOCK  (ENV-ANALYZE  NOOE  RED'"! 

(TRIV-ANALYZE  NODE  HtVj) 

(IF  OPT  (EFFS-ANALYZE  NOOE  REDO)) 

NOOE  ) ) ) 

(SCHMAC  CL  (FNNAME)  ’(TEST-COMPILE  \FNNAME)) 

(DEFINE  TEST-COMPILE 

(LAMBDA  (FNNAME) 

(LET  ((FN  (GET  FNNAME  ’ SCHEME  FUNCTION )) ) 

(COND  (FN  (ASET*  ‘TESTING*  T) 

(ASET1  ‘TEST*  NIL)  ; PURELY  TO  RELEASE  FORMER  6ARBAGE 

(ASET*  ‘ERROR-COUNT*  0) 

(ASET1  ‘ERROR-LIST*  NIL) 

(ASET-  ‘TEST*  (COMPILE  FNNAME  FN  T ’MAYBE)) 

(SPRINTER  ‘TEST* ) 

’(.(IF  (ZEROP  ‘ERROR-COUNT*)  ’NO  ‘ERROR-COUNT*)  ERRORS)) 

(T  *( .FNNAME  NOT  DEFINED)))))) 
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Here  are  the  structured  data  types  used  for  the  pass  1 intermediate 
representation.  Each  piece  of  the  program  is  represented  as  a NODE,  which  has 
various  pieces  of  information  associated  with  it.  The  FORM  component  is  a 
structure  of  one  of  the  types  CONSTANT,  VARIABLE,  LAMBDA,  IF,  ASET,  CATCH, 
LABELS,  or  COMBINATION.  This  structure  holds  information  specific  to  a given 
type  of  program  node,  whereas  the  NODE  structure  itself  holds  information  which 
is  needed  at  every  node  of  the  program  structure.  (One  may  think  of  the  FORfl 
component  as  a PASCAL  record  variant.) 

The  ALPHATIZE  routine  and  its  friends  take  the  S-expression  definition  of 
a function  (a  lambda-expression)  and  make  a copy  of  it  using  NODE  structures. 
This  copy,  like  the  S-expression,  is  a tree.  Subsequent  analysis  routines  will 
all  recur  on  this  tree,  passing  information  up  and  down  the  tree,  either 
distributing  information  from  parent  node  to  child  nodes,  or  collating 
information  from  child  nodes  to  pass  back  to  parent  nodes.  Some  information  must 
move  laterally  within  the  tree,  from  branch  to  branch;  this  is  accomplished 
exclusively  by  using  the  property  lists  of  symbols,  usually  those  generated  for 
renamings  of  variables  (since  all  lateral  information  is  associated  with  variable 
references  - which  is  no  accident!). 

The  function  NODIFY  is  used  for  constructing  a node,  with  certain  slots 
properly  initialized.  In  particular,  the  METAP  slot  is  initialized  to  NIL, 
indicating  a node  not  yet  processed  by  META- EVALUATE;  this  fact  will  be  used 
later  in  the  optimizer.  A name  is  generated  for  the  node,  and  the  node  is  put  on 
the  property  list  of  the  name.  This  property  is  for  debugging  purposes  only; 
given  the  name  of  a node  one  can  get  the  node  easily.  The  name  itself  may  also 
be  used  for  another  purpose  by  CONVERT-COMBINATION,  to  represent  the  intermediate 
quantity  which  is  the  value  of  the  form  represented  by  the  node. 


;;  ALPHA-CONVERSION 


RABBIT  566  05/15/78  Page  6 


001 

002 

003 

004 

005 

006 
007 
00S 

009 

010 

011 

012 

013 

014 

015 

016 
017 
016 

019 

020 
021 
022 

023 

024 

025 

026 

027 

028 

029 

030 

031 

032 

033 

034 

035 

036 

037 
036 

039 

040 

041 

042 

043 

044 

045 

046 

047 
046 

049 

050 

051 

052 

053 

054 

055 

056 

057 

058 

059 

060 
061 
062 

063 

064 

065 

066 
067 


;;;  HERE  WE  RENAME  ALL  VARIABLES.  AND  CONVERT  THE  EXPRESSION  TO  AN  EQUIVALENT  TREE-LIKE  FORM 
;;;  WITH  EXTRA  SLOTS  TO  BE  FILLED  IN  LATER.  AFTER  THIS  POINT.  THE  NEW  NAMES  ARE  USED  FOR 
;;;  VARIABLES.  AND  THE  USER  NAMES  ARE  USED  ONLY  FOR  ERROR  MESSAGES  AND  THE  LIKE.  THE  TREE-LIKE 
;;;  FORM  WILL  BE  USED  AND  AUGMENTED  UNTIL  IT  IS  CONVERTED  TO  CONTINUATION-PASSING  STYLE. 


;;;  WE  ALSO  FIND  ALL  USER-NAMED  LAMBDA-FORMS  AND  SET  UP  APPROPRIATE  PROPERTIES. 

;;;  THE  USER  CAN  NAME  A LAMBDA-FORM  BY  WRITING  (LAMBDA  (X)  BODY  NAME). 

( OEF  TYPE  NODE  (NAME  SEXPR  ENV  REFS  ASETS  TRIVP  EFFS  AFFD  PEFFS  PAFFD  METAP  SUBSTP  FORM)  (SEXPR)) 
: NAME : A GENSYM  WHICH  NAMES  THE  NODE'S  VALUE 
;SEXPR : THE  S-EXPRESSION  WHICH  WAS  ALPHATIZED  TO  MAKE  THIS  NODE 
; (USED  ONLY  FOR  WARNING  MESSAGES  AND  DEBUGGING) 

;ENV : THE  ENVIRONMENT  OF  THE  NODE  (USED  ONLY  FOR  DEBUGGING) 

; REFS : ALL  VARIABLES  BOUND  ABOVE  AND  REFERENCED  BELOW  OR  BY  THE  NODE 

; ASETS : ALL  LOCAL  VARIABLES  SEEN  IN  AN  ASET  BELOW  THIS  NODE  (A  SUBSET  OF  REFS) 

; TR I VP : NON-NIL  IFF  EVALUATION  OF  THIS  NODE  IS  TRIVIAL 

;EFFS : SET  OF  SIDE  EFFECTS  POSSIBLY  OCCURRING  AT  THIS  NODE  OR  BELOW 

; AFFD : SET  OF  SIOE  EFFECTS  WHICH  CAN  POSSIBLY  AFFECT  THIS  NODE  OR  BELOW 

;PEFF5 : ABSOLUTELY  PROVABLE  SET  OF  EFFS 

.PAFFD:  ABSOLUTELY  PROVABLE  SET  OF  AFFD 

;METAP : NON-NIL  IFF  THIS  NODE  HAS  BEEN  EXAMINED  BY  THE  META-EVALUATOR 
; SUBSTP : F LAG  INDICATING  WHETHER  META- SUBSTITUTE  ACTUALLY  MADE  A SUBSTITUTION 
;FORM:  ONE  OF  THE  BELOW  TYPES 


( OEF  TYPE  CONSTANT  (VALUE)) 

;VALUE  : THE  S-EXPRESSION  VALUE  OF  THE  CONSTANT 
(DEFTYPE  VARIABLE  (VAR  GLOBALP)) 

; VAR  : THE  NEW  UNIQUE  NAME  FOR  THE  VARIABLE.  GENERATED  BY  ALPHATIZE. 

; THE  USER  NAME  AND  OTHER  INFORMATION  IS  ON  ITS  PROPERTY  LIST. 

;GLOBAlP:  NIL  UNLESS  THE  VARIABLE  IS  GLOBAL  (IN  WHICH  CASE  VAR  IS  THE  ACTUAL  NAME) 

( OEF  TYPE  LAMBOA  (UVARS  VARS  BODY)) 

.UVARS:  THE  USER  NAMES  FOR  THE  BOUND  VARIABLES  (STRICTLY  FOR  DEBUGGING  (SEE  SEXPRFY) ) 
.VARS.  A LIST  OF  THE  GENERATED  UNIQUE  NAMES  FOR  THE  BOUND  VARIABLES 
; BODY : THE  NODE  FOR  THE  BODY  OF  THE  LAMBOA -EXPRESS ION 
(DEFTYPE  IF  (PRED  CON  ALT)) 

;PRED : THE  NODE  FOR  THE  PREDICATE 

;CON : THE  NODE  FOR  THE  CONSEQUENT 

;ALT : THE  NOOE  FOR  THE  ALTERNATIVE 

(DEFTYPE  ASET  (VAR  BODY  GLOBALP)) 

; VAR : THE  GENERATED  UNIQUE  NAME  FOR  THE  ASET  VARIABLE 

; BODY : THE  NOOE  FOR  THE  BODY  OF  THE  ASET 

;GLOBALP : NIL  UNLESS  THE  VARIABLE  IS  GLOBAL  (IN  WHICH  CASE  VAR  IS  THE  ACTUAL  NAME) 
(DEFTYPE  CATCH  (UVAR  VAR  BODY)) 

;UVAR:  THE  USER  NAME  FOR  THE  BOUND  VARIABLE  (STRICTLY  FOR  DEBUGGING  (SEE  SEXPRFY)) 

;VAR : THE  GENERATED  UNIQUE  NAME  FOR  THE  BOUNO  VARIABLE 

;BOOY : THE  NODE  FOR  THE  BODY  OF  THE  CATCH 
(DEFTYPE  LABELS  (UFNVARS  FNVARS  FNDEFS  BOOY)) 

; UF  NVARS  : THE  USER  NAMES  FOR  THE  BOUND  LABELS  VARIABLES 
; FNVARS : A LIST  OF  THE  GENERATED  UNIQUE  NAMES  FOR  THE  LABELS  VARIABLES 
;FN0EFS : A LIST  OF  THE  NOOES  FOR  THE  LAMBDA-EXPRESSIONS 
; BOOY : THE  NODE  FOR  THE  BOY  OF  THE  LABELS 
(DEFTYPE  COMBINATION  (ARGS  WARNP ) ) 

; ARGS : A LIST  OF  THE  NODES  FOR  THE  ARGUMENTS  (THE  FIRST  IS  THE  FUNCTION) 

;WARNP : NON-NIL  IFF  CHECK-COMB FNAT ION-PEFFS  HAS  DETECTEO  A CONFLICT  IN  THIS  COMBINATION 

(DEFINE  NODIFY 

(LAMBOA  (FORM  SEXPR  ENV) 

(LET  ((N  ( CONS  - NOOE  (NAME  • (GENTEMP  ’NOOE)) 

(FORM  - FORM) 

(SEXPR  ■ SEXPR) 

(ENV  • ENV) 

(METAP  • NIL)))) 

(PUTPROP  (NOOEVNAME  N)  N ’NOOE) 

N))) 
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ALPHATIZE  takes  an  S-expression  to  convert,  and  an  environment.  The 
latter  is  a list  of  2-lists;  each  2-list  is  of  the  form  (user-name  new-name). 
This  is  used  for  renaming  each  variable  to  a unique  name.  The  unique  names  are 
generated  within  ALPHA-LAMBDA,  ALPHA-LABELS,  and  ALPHA-CATCH,  where  the  variable 
bindings  are  encountered.  The  new  name  pairings  are  tacked  onto  the  front  of  the 
then-current  environment,  and  the  result  used  as  the  environment  for  converting 
the  body. 

ALPHATIZE  merely  does  a dispatch  on  the  type  of  form,  to  one  of  the  sub- 
functions for  the  various  types.  It  also  detects  forms  which  are  really  macro 
calls,  and  expands  them  by  calling  MACRO-EXPAND,  which  returns  the  form  to  be 
used  in  place  of  the  macro  call.  (BLOCK  is  handled  as  a separate  special  case. 
In  the  interpreter,  BLOCK  is  handled  specially  rather  than  going  through  the 
general  MACRO  mechanism.  This  is  done  purely  for  speed.  Defining  BLOCK  as  a 
macro  in  the  compiler  can  confuse  the  interpreter  in  which  the  compiler  runs,  and 
so  it  was  decided  simply  to  handle  BLOCK  as  a special  case  in  the  compiler  also.) 
ALPHATIZE  allows  the  S-expression  to  contain  already  converted  code  in  the  form 
of  NJODEs;  this  fact  is  exploited  by  the  optimizer  (see  META-IF-FUDGE  below),  but 
has  no  use  in  the  initial  conversion. 

ALPHA-ATOM  creates  a CONSTANT  structure  for  numbers  and  the  special 
symbols  NIL  and  T.  Otherwise  a VARIABLE  structure  is  created.  If  the  symbol  (it 
better  be  a symbol!)  occurs  in  the  environment,  the  new-name  is  used,  and 
otherwise  the  symbol  itself.  The  slot  GLOBALP  is  set  to  T iff  the  symbol  was  not 
in  the  environment. 

ALPHA-LAMBDA  generates  new  names  for  all  the  bound  variables.  It  then 
converts  its  body,  after  using  PAIRLIS  to  add  the  user-name/new-name  pairs  to  the 
environment.  The  result  is  used  to  make  a LAMBDA  structure.  A copy  is  made  of 
the  list  of  variables  in  the  UVARS  slot;  it  must  be  copied  because  later  META- 
COMBINATION-LAMBDA may  splice  out  elements  of  that  list.  If  so,  it  will  also 
splice  out  corresponding  members  of  VARS,  but  that  list  was  freshly  consed  by 
ALPHA- LAMBDA. 
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;;;  ON  NOOE  NAMES  THESE  PROPERTIES  ARE  CREATEO: 

; ; ; NOOE  THE  CORRESPONDING  NOOE 

(DEFINE  AlPHATIZE 

(LAMBDA  (SEXPR  ENV) 

(COND  ((ATOM  SEXPR) 

(ALPHA-ATOM  SEXPR  ENV)) 

( ( HUNKP  SEXPR) 

(IF  (EQ  (TYPE  SEXPR)  'NOOE) 

SEXPR 

(ERROR  '(Peculiar  hunk  • ALPHATIZE | SEXPR  'FAIL-ACT))) 
((EO  (CAR  SEXPR)  ' QUOTE ) 

(NOOIFY  (CONS-CONSTANT  (VALUE  • (CAOR  SEXPR)))  SEXPR  ENV)) 
((EQ  (CAR  SEXPR)  'LAMBOA) 

(ALPHA-LAMBDA  SEXPR  ENV)) 

((EQ  (CAR  SEXPR)  'IF) 

(ALPHA- IF  SEXPR  ENV)) 

((EQ  (CAR  SEXPR)  'ASET) 

(ALPHA-ASET  SEXPR  ENV)) 

((EQ  (CAR  SEXPR)  'CATCH) 

(ALPHA-CATCH  SEXPR  ENV)) 

((EQ  (CAR  SEXPR)  'LABELS) 

(ALPHA-LABELS  SEXPR  ENV)) 

((EQ  (CAR  SEXPR)  'BLOCK) 

(ALPHA-BLOCK  SEXPR  ENV)) 

((AND  (ATOM  (CAR  SEXPR)) 

(EQ  (GET  (CAR  SEXPR)  'AINT)  ’AMACRO)) 

(ALPHATIZE  (MACRO-EXPAND  SEXPR)  ENV)) 

(T  (ALPHA-COMBINATION  SEXPR  ENV))))) 


(DEFINE  ALPHA-ATOM 

(LAMBOA  (SEXPR  ENV) 

(IF  (OR  (NUMBERP  SEXPR)  (NULL  SEXPR)  (EQ  SEXPR  *T)) 

(NODIFY  (CONS-CONSTANT  (VALUE  * SEXPR))  SEXPR  ENV) 

(LET  ((SLOT  ( ASSQ  SEXPR  ENV))) 

(NOOIFY  (CONS-VARIABLE  (VAR  - (IF  SLOT  (CADR  SLOT)  SEXPR)) 
(6L0BALP  • (NULL  SLOT))) 

SEXPR 

ENV))))) 


(DEFINE  ALPHA-LAMBDA 

(LAMBDA  (SEXPR  ENV) 

(LET  ((VARS  (DO  ((I  (LENGTH  (CADR  SEXPR))  (-  I 1)) 

(V  NIL  (CONS  (6ENTEMP  'VAR)  V))j 
((■=  I 0)  (NREVERSE  V))))) 

(IF  (CDDOR  SEXPR) 

(WARN  | Mai  formed  LAMBOA  expression!  SEXPR)) 

(NODIFY  (CONS-LAMBDA  (UVARS  « (APPEND  (CADR  SEXPR)  NIL)) 

; ;SEE  META-COMBINATION-LAMBDA 
(VARS  « VARS) 

( BOOY  • (ALPHATIZE  (CADDR  SEXPR) 

(PAIRLIS  (CADR  SEXPR) 
VARS 
ENV)))) 

SEXPR 

ENV)))) 
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ALPHA- IF  simply  converts  the  predicate,,  consequent,  and  alternative,  and 
makes  an  IF  structure. 

ALPHA-ASET  checks  for  a non-quoted  first  argument.  (Presently  RABBIT 
does  not  allow  for  computed  ASET  variables.  Since  RABBIT  was  written,  such 
computed  variables  have  in  fact  been  banned  from  the  SCHEME  language  [Revised 
Report].)  For  simplicity,  it  also  does  not  allow  altering  a global  variable 
which  is  the  name  of  a MacLISP  primitive.  This  restriction  is  related  only  to 
the  kludginess  of  the  PDP-10  MacLISP  SCHEME  implementation,  and  is  not  an 
essential  problem  with  the  language.  The  ERROR  function  was  used  here  rather 
than  WARN  because  the  problems  are  hard  to  correct  for  and  occur  infrequently. 
Aside  from  these  difficulties,  ALPHA-ASET  is  much  like  ALPHA-ATOM  on  a variable; 
it  looks  in  the  environment,  converts  the  body,  and  then  constructs  an  ASET 
structure . 

ALPHA-CATCH  generates  a new  name  "CATCHVAR-nn"  for  the  bound  variable, 
tacks  it  onto  the  environment,  and  converts  the  body;  it  then  constructs  a CATCH 
structure . 

ALPHA-LABELS  generates  new  names  "FNVAR-n"  for  all  the  bound  variables; 
it  then  constructs  in  LENV  the  new  environment,  using  PAIRLIS.  It  then  converts 
all  the  bound  function  definitions  and  the  body,  using  this  environment.  In  this 
way  all  the  function  names  are  apparent  to  all  the  functions.  A LABELS  structure 
is  then  created. 
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(OEF1NE  ALPHA- IF 

(LAMBDA  (SEXPR  ENV) 

(NODIFY  (CONS- IF  (PRED  * (ALPHATIZE  (CADR  SEXPR)  ENV)) 

(CON  • (ALPHATIZE  (CADOR  SEXPR)  ENV)) 

(ALT  • (ALPHATIZE  (CAODDR  SEXPR)  ENV))) 

SEXPR 

ENV))) 

(DEFINE  ALPHA-ASET 

(LAMBDA  (SEXPR  ENV) 

(LET  ((VAR  (COND  ((OR  (ATOM  (CADR  SEXPR)) 

(NOT  (EQ  (CAADR  SEXPR)  'QUOTE))) 

(ERROR  *|Can't  Compile  Non-quoted  ASET  Variable! 

SEXPR 

•FAIL-ACT)) 

(T  (CADADR  SEXPR))))) 

(LET  ((SLOT  (ASSO  VAR  ENV))) 

(IF  (AND  (NULL  SLOT)  (TRIVFN  VAR)) 

(ERROR  ' (Illegal  to  ASET  a MacLISP  primitive! 

SEXPR 

•FAIL-ACT)) 

(NOOIFY  (CONS-ASET  (VAR  « (IF  SLOT  (CADR  SLOT)  VAR)) 

(GLOBALP  • (NULL  SLOT)) 

(BODY  > (ALPHATIZE  (CADOR  SEXPR)  ENV))) 

SEXPR 

ENV))))) 

(DEFINE  ALPHA-CATCH 

(LAMBDA  (SEXPR  ENV) 

(LET  ((VAR  (GENTEMP  ' C ATCHVAR ) ) ) 

(NOOIFY  (CONS-CATCH  (VAR  • VAR) 

(UVAR  • (CADR  SEXPR)) 

(BOOY  • (ALPHATIZE  (CADOR  SEXPR) 

(CONS  (LIST  (CADR  SEXPR)  VAR) 

ENV)))) 

SEXPR 

ENV)))) 

(DEFINE  ALPHA-LABELS 

(LAMBDA  (SEXPR  ENV) 

(LET  ((UFNVARS  ( AMAPCAR  (LAMBDA  (X) 

(IF  (ATOM  (CAR  X)) 

(CAR  X) 

(CAAR  X))) 

(CADR  SEXPR)))) 

(LET  ( (FNVARS  (DO  ((I  (LENGTH  UFNVARS)  (-  I 1)) 

(V  NIL  (CONS  (GENTEMP  ' FNVAR ) V))) 

((>  I 0)  ( NREVERSE  V))))) 

(LET  ((LENV  (PAIRLIS  UFNVARS  FNVARS  ENV))) 

(NOOIFY  (CONS -LABELS  (UFNVARS  • UFNVARS) 

(FNVARS  « FNVARS) 

(FNDEFS  • (AMAPCAR 

(LAMBDA  (X) 

( ALPHA -LABELS -OEFN  X LENV)) 
(CADR  SEXPR))) 

(BOOY  ■ (ALPHATIZE  (CADOR  SEXPR)  LENV))) 

SEXPR 

ENV)))))) 


ALPHA-LABELS-DEFN  parses  one  LABELS  definition  clause.  An  extension  to 
the  SCHEME  language  (made  just  after  the  publication  of  [Revised  Report]!) 
allows  a LABELS  definition  to  take  on  any  of  the  same  three  forms  permitted  by 
DEFINE.  Thus  this  LABELS  form  actually  defines  FOO,  BAR,  and  BAZ  to  be 
equivalent  functions: 

(LABELS  ((FOO  (LAMBDA  (X  Y)  ( BLOCK  (PRINT  X)  (♦  X Y) ) ) ) 

(BAR  (X  Y)  (PRINT  X)  (♦  X Y))  \ 

((BAZ  X Y)  (PRINT  X)  (♦  X Y))) 

(LIST  (FOO  1 2)  (BAR  1 2)  (BAZ  1 2))) 


ALPHA-BLOCK  implements  the  standard  macro  definition  of  BLOCK.  (BLOCK  x) 
is  simply  x,  and  (BLOCK  x . y)  expands  into: 

((LAMBDA  (A  B)  (B))  x (LAMBDA  ()  (BLOCK  . y) ) ) 


MACRO-EXPAND  takes  a macro  call  and  expands  it  into  a new  form  to  be  used 
in  place  of  the  macro  call.  In  the  PDP-10  MacLISP  SCHEME  implementation  there 
are  three  different  kinds  of  macros.  Types  MACRO  and  AMACRO  are  defined  by 
MacLISP  code,  and  so  their  defining  functions  are  invoked  using  the  MacLISP 
primitive  FUNCALL.  Type  SMACRO  is  defined  by  SCHEME  code  which  is  in  the  value 
cell  of  an  atomic  symbol;  thus  SYMEVAL  is  used  to  get  the  contents  of  the  value 
cell,  and  this  SCHEME  function  is  then  invoked. 


ALPHA-COMBINATION  converts  all  the  subforms  of  a combination,  making  a 
list  of  them,  and  creates  a COMBINATION  structure.  If  the  function  position 
contains  a variable,  it  performs  a consistency  check  using  CHECK-NUMBER-OF-ARGS 
to  make  sure  the  right  number  of  arguments  is  present. 
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(LAMBDA  ( LOEF  LENV) 

( ALPHATI2E  (IF  (ATOM  (CAR  LOEF)) 

(IF  (COOR  LOEF) 

'(LAMBDA  (CADR  LOEF)  , ( BLOCKIF Y (CDDR  LOEF  ) ) ) 
(CADR  LOEF)) 

'(LAMBDA  , (CDAR  LOEF)  .(BLOCKIFY  (COR  LOEF)))) 
LENV))) 


(DEFINE  ALPHA-BLOCK 

(LAMBDA  ( SEXPR  ENV) 

(COND  ((NULL  (COR  SEXPR)) 

(WARN  (BLOCK  with  no  f ormi | 

'(ENV  « , ( AMAPCAR  CAR  ENV))) 

(ALPHATIZE  NIL  ENV)) 

(T  (LABELS  ((MUNG 

(LAMBDA  (600Y) 

(IF  (NULL  (CDR  BODY)) 

(CAR  BODY) 

’((LAMBDA  (A  B)  ( B ) ) 

.(CAR  BOOY) 

(LAMBDA  ()  .(MUNG  (CDR  BODY)))))))) 
(ALPHATIZE  (HUNG  (CDR  SEXPR))  ENV)))))) 


(DEFINE  MACRO- EXPAND 

(LAMBDA  (SEXPR) 

(LET  ((M  (GETL  (CAR  SEXPR)  '(MACRO  AMACRO  SMACRO)))) 

(IF  (NULL  M) 

(BLOCK  (WARN  Imissing  macro  definition!  SEXPR) 

'(ERROR  'lUnderined  Macro  Form|  ' .SEXPR  'FAIL-ACT)) 
(EQCASE  (CAR  M) 

(MACRO  (F UNCALL  (CADR  M)  SEXPR)) 

(AMACRO  (FUNCALL  (CADR  M)  SEXPR)) 

(SMACRO  ((SYMEVAL  (CADR  M))  SEXPR))))))) 


(DEFINE  ALPHA-COMBINATION 

(LAMBDA  (SEXPR  ENV) 

(LET  ((N  (NOOIFY  (CONS-COMBINATION 
( WARNP  • NIL) 

(ARGS  > (AMAPCAR  (LAMBDA  (X)  (ALPHATIZE  X ENV)) 
SEXPR))) 

SEXPR 

ENV))) 

(LET  ((M  ( NODE \F ORM  (CAR  ( COMB INAT ION\ARGS  ( NOOEVFORM  N)))))) 
(IF  (AND  (EQ  (TYPE  M)  'VARIABLE) 

( VAR I ABLE \6L0BAL P M)) 

(CHECK-NUHBER-OF -ARGS 
(VAR  I ABLE  WAR  M) 

(LENGTH  (COR  ( COMB  I NAT I ON \ ARGS  (NOOEXFORM  N)))) 
«IL)) 

«)))> 
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Once  the  S-expression  function  definition  has  been  copied  as  a NODE  tree, 

COMPILE  calls  PASS1 -ANALYZE  to  fill  in  various  pieces  of  information.  (If 

optimization  is  to  be  performed,  COMPILE  instead  calls  META -EVALUATE . META- 

EVAl.UATE  in  turn  calls  PASS1-ANALYZE  in  a coroutining  manner  we  will  examine 

later.)  PASS  1 -ANALYZE  in  turn  calls  ENV-ANALYZE,  TRIV-ANALYZE,  and  EFFS-ANALYZE  | 

in  order.  Each  of  these  has  roughly  the  same  structure.  Each  takes  a node  and  a 

flag  called  REDOTHIS.  Normally  REDOTHIS  is  NIL  and  the  information  has  not  yet 

been  installed  in  the  node,  and  so  the  routine  proceeds  to  analyze  the  node  and 

install  the  appropriate  information. 

When  invoked  by  the  optimizer,  however,  there  may  be  information  in  the 
node  already,  but  that  information  may  be  incorrect  or  obsolete  as  a result  of 
the  optimizing  transformations.  If  REDOTHIS  is  non-NIL,  then  the  given  node  must 
be  reanalyzed,  even  if  the  information  is  already  present.  If  REDOTHIS  is  in 
fact  the  symbol  ALL,  then  all  descendants  of  the  given  node  must  be  reanalyzed. 

Otherwise,  only  the  given  node  requires  re-analysis,  plus  any  descendants  which  ; 

have  not  had  the  information  installed  at  all.  We  will  see  later  how  these 
mechanisms  are  used  in  the  optimizer. 

The  purpose  of  ENV-ANALYZE  is  to  fill  in  for  each  node  the  slots  REFS  and 
ASETS.  The  first  is  a set  (represented  as  a list)  of  the  new-names  of  all 
variables  bound  above  the  node  and  referenced  at  or  below  the  node,  and  the 
second  (a  subset  of  the  first)  is  a set  of  such  names  which  appear  in  an  ASET  at 
or  below  the  node.  These  lists  are  computed  recursively.  A CONSTANT  node  has  no 
such  references;  a VARIABLE  node  (with  GLOBAL?  = NIL)  refers  to  its  own 
variable.  An  ASET  node  adds  its  variable  to  the  ASET  list  for  its  body.  Most 
other  kinds  of  nodes  merely  merge  together  the  lists  for  their  immediate 
descendants.  In  order  to  satisfy  the  "bound  above  the  node"  requirement,  those 
structures  which  bind  variables  (LAMBDA,  CATCH,  LABELS)  filter  out  their  own 
bound  variables  from  the  two  sets. 

As  an  example,  consider  this  function: 

(LAMBDA  (X) 

((LAMBDA  (Y) 

((LAMBDA  (W) 

(ASET'  Z (*  X Y))) 

(ASET'  Y (-  Y 1)))) 

(-  X 3))) 

The  node  for  (-  X 3)  would  have  a REFS  list  (X)  and  an  ASET  list  ().  The  node 
for  the  ASET  on  Z would  have  REFS=(X  Y)  (or  perhaps  (Y  X))  and  ASETS=();  Z does 
not  appear  in  the  ASETS  list  because  it  is  not  bound  above.  The  node  for  the 
combination  ((LAMBDA  (W)  . . .)  ...)  would  have  REFS=(X  Y)  and  ASETS=(Y).  The 
node  for  the  lambda-expression  (LAMBDA  (Y)  ...)  would  have  REFSs(X)  and 

ASETS=(),  because  Y is  filtered  out. 


001 
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002  ; ; 

ENVIRONMENT  ANALYSIS. 

003 

004  ; ; 

FOR  NOOES  ENCOUNTEREO  WE  FILL  IN: 

005  ; ; 

REFS 

006  ; ; 

ASETS 

007  ;; 

ON  VARIABLE  NAMES  THESE  PROPERTIES  ARE  CREATED: 

008  ;; 

BINDING  THE  NODE  WHERE  THE  VARIABLE  IS  BOUND  ‘ 

009  ; ; 

USER-NAME  THE  USER'S  NAME  FOR  THE  VARIABLE  (WHERE  BOUND) 

010  ;; 

REAO-REFS  VARIABLE  NOOES  WHICH  READ  THE  VARIABLE 

Oil  ; ; 

WRITE -REFS  ASET  NODES  WHICH  SET  THE  VARIABLE 

012 

013  ;; 

NORMALLY.  ON  RECURRING  TO  A LOWER  NODE  WE  STOP  IF  THE  INFORMATION 

014  ; ; 

IS  ALREADY  THERE.  MAKING  THE  PARAMETER  "REDOTHIS"  BE  "ALL*  FORCES 

015  ;; 

RE-COMPUTATION  TO  ALL  LEVELS;  MAKING  IT  "ONCE"  FORCES 

016  ;; 

RECOMPUTATION  OF  THIS  NODE  BUT  NOT  OF  SUBNOOES. 

017 

016  (DEFINE  ENV-ANALYZE  j 

019 

(LAMBDA  (NODE  REDOTHIS) 

020 

(IF  (OR  REDOTHIS  (EMPTY  ( NODEXREFS  NODE))) 

021 

(LET  ((FM  (NOOEXFORM  NODE)) 

022 

(REDO  (IF  (EQ  REDOTHIS  ’ALL)  'ALL  NIL))) 

023 

(EQCASE  (TYPE  FM) 

024 

(CONSTANT  } 

025 

(ALTER-NOOE  NODE 

026 

(REFS  :*  NIL) 

027 

(ASETS  :«  NIL))) 

028 

(VARIABLE 

029 

(AOOPROP  (VARIABLEWAR  FM)  NOOE  'READ-REFS) 

030 

(IF  ( VAR I ABLE\GLOBALP  FM) 

031 

( SETPROP  (VARIABLEWAR  FM)  (VARIABLEWAR  FM)  ‘USER-NAME)) 

032 

(ALTER-NOOE  NOOE 

033 

(REFS  :«  (AND  (NOT  ( VAR 1 ABLEXGLOBALP  FM)) 

034 

(LIST  (VARIABLEWAR  FM)))) 

035 

(ASETS  :•  NIL))) 

036 

(LAMBDA 

037 

(DO  ((V  ( LAM8DAWARS  FM)  (CDR  V)) 

038 

(UV  (LAMBDAXUVARS  FM)  (CDR  UV))) 

039 

((NULL  V)) 

040 

(SETPROP  (CAR  V)  (CAR  UV)  'USER-NAME) 

041 

(SETPROP  (CAR  V)  NODE  'BINDING)) 

042 

(LET  ((B  (LAMBDAXBOOY  FM))) 

043 

(ENV-ANALYZE  B REDO) 

044 

(ALTER-NOOE  NODE 

045 

(REFS  (SETOIFF  (NODEXREFS  B) 

046 

( LAMBOAWARS  FM))) 

047 

(ASETS  :•  (SETDIFF  (NODEXASETS  B) 

048 

(LAMBOAWARS  FM)))))) 

049 

OF 

050 

(LET  ((PREO  (IFXPREO  FM)) 

051 

(CON  (IFXCON  FM)) 

052 

(ALT  ( IF  SALT  FM))) 

053 

(ENV-ANALYZE  PREO  REDO) 

054 

(ENV-ANALYZE  CON  REDO) 

055 

(ENV-ANALYZE  ALT  REDO) 

056 

(ALTER-NOOE  NOOE 

057 

(REFS  :•  (UNION  (NODEXREFS  PREO) 

056 

(UNION  (NODEXREFS  CON) 

059 

(NOOEXREFS  ALT)))) 

060 

(ASETS  :•  (UNION  (NOOEXASETS  PRED) 

061 

(UNION  (NODEXASETS  CON) 

062 

(NOOEXASETS  ALT))))))) 

063 

(ASET 

064 

(LET  ((6  (ASETXBOOY  FM)) 

065 

(V  (ASETXVAR  FM))) 

066 

(ENV-ANALYZE  B REDO) 

067 

(AOOPROP  V NOOE  ’WRITE-REFS) 

066 

(IF  ( ASETXGLOBALP  FM) 

069 

(ALTER-NOOE  NOOE 

« 

144 


It  should  be  easy  to  see  the  the  topmost  node  of  the  node-tree  must  have 
REFS=()  and  ASETS=(),  because  no  variables  are  bound  above  it.  This  fact  is  used 
in  COMPILE  for  a consistency  check.  (After  writing  this  last  sentence,  1 noticed 
that  in  fact  this  consistency  check  was  not  being  performed,  and  that  it  was  a 
good  idea.  On  being  installed,  this  check  immediately  caught  a subtle  bug  in  the 
optimizer.  Consistency  checks  pay  off!) 

Another  purpose  accomplished  by  ENV-ANALYZE  is  the  installation  of 
several  useful  properties  on  the  new-names  of  bound  variables.  Two  properties, 
READ-REFS  and  WRITE-REFS,  accumulate  for  each  variable  the  set  of  VARIABLE  nodes 
which  refer  to  it  and  the  set  of  ASET  nodes  that  refer  to  it.  These  lists  are 
very  important  to  the  optimizer.  A non-empty  WRITE-REFS  set  also  calls  for 
special  action  by  the  code  generator. 

When  a LAMBDA  node  is  encountered,  that  node  is  put  onto  each  new-name 
under  the  BINDING  property,  and  the  user-name  is  put  under  the  USER-NAME 
property;  these  are  used  only  for  debugging. 
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(REFS  :*  (NOOENREFS  B)) 

(ASETS  :*  ( NODENASETS  B))) 

(ALTER -NODE  NODE 

(REFS  :>  (ADJOIN  V (NOOENREFS  6))) 

(ASETS  :«  (ADJOIN  V (NODENASETS  B))))))> 

(CATCH 

(LET  ( ( B (CATCHNBODY  FM)) 

(V  (CATCHNVAR  FM))) 

(SETPROP  V (CATCHNUVAR  FM)  ’USER-NAME) 

(SETPROP  V NODE  ’BINDING) 

(ENV-ANALYZE  B REDO) 

(ALTER -NODE  NOOE 

(REFS  :•  (REMOVE  V (NOOENREFS  B))) 

(ASETS  :*  (REMOVE  V (NODENASETS  8)))))) 


(LABELS 

(DO  ((V  (LA6ELSNFNVARS  FM)  (CDR  V)) 

(UV  ( LABELSNUF NVARS  FM)  (CDR  UV)) 

(D  (LABELSNF NDEFS  FM)  (CDR  D)) 

(R  NIL  (UNION  R (NOOENREFS  (CAR  0)))) 

(A  NIL  (UNION  A (NODENASETS  (CAR  D))))) 

((NULL  V) 

(LET  ((B  (LABELSNBOOY  FM))) 

(ENV-ANALYZE  B REDO) 

(ALTER-NOOE  NODE 

(REFS  :•  (SETDIFF 

(UNION  R (NOOENREFS  B)) 

( LABELSNF  NVARS  FM))) 
(ASETS  (SETDIFF 

(UNION  A (NODENASETS  B)) 
(LABELSNF NVARS  FM)))))) 
(SETPROP  (CAR  V)  (CAR  UV)  ‘USER-NAME) 

(SETPROP  (CAR  V)  NODE  ’BINDING) 

(ENV-ANALYZE  (CAR  D)  REDO))) 

(COMBINATION 

(LET  ((AR6S  (COMBINATIONNARGS  FM))) 

(AMAPC  (LAMBDA  (X)  (ENV-ANALYZE  X REDO))  ARGS) 

(DO  ((A  ARGS  (COR  A)) 

(R  NIL  (UNION  R (NOOENREFS  (CAR  A)))) 

(S  NIL  (UNION  S (NODENASETS  (CAR  A))))) 

((NULL  A) 

(ALTER-NOOE  NOOE 

(REFS  :•  R) 

(ASETS  :■  S))))>)>)))) 


TRIV-ANALYZE  fills  in  the  TRIVP  slot  for  each  node.  This  is  a flag 

which,  if  non-NIL,  indicates  that  the  code  represented  by  that  node  and  its 
descendants  is  "trivial",  i.e.  it  can  be  executed  as  simple  host  machine 

(MacLISP)  code  because  no  SCHEME  closures  are  involved.  Constants  and  variables 

are  trivial,  as  are  combinations  with  trivial  arguments  and  a provably  trivial 
function.  While  lambda-expressions  are  in  general  non-trivial  (because  a closure 
must  be  constructed),  a special  case  is  made  for  ((LAMBDA  ...)  ...),  i.e.  a 

combination  whose  function  is  a lambda-expression.  This  is  possible  because  the 
code  generator  will  not  really  generate  a closure  for  the  lambda-expression. 
This  is  the  first  example  of  a trichotomy  we  will  encounter  repeatedly. 
Combinations  are  divided  into  three  kinds:  those  with  a lambda-expression  in  the 
function  position,  those  with  a trivial  MacLISP  primitive  (satisfying  the 
predicate  TR1VFN)  in  the  function  position,  and  all  others. 

All  other  expressions  are,  in  general,  trivial  iff  all  their  subparts  are 
trivial.  Note  that  a LABELS  is  trivial  iff  its  body  is  trivial;  the  non- 
triviality of  the  bound  functions  does  not  affect  this. 

The  triviality  flag  is  used  by  phase  2 to  control  conversion  to 
continuation-passing  style.  This  in  turn  affects  the  code  generator,  which 
compiles  trivial  forms  straightforwardly  into  MacLISP  code,  rather  than  using  the 
more  complex  techniques  required  by  non-trivial  SCHEME  code.  It  would  be 
possible  to  avoid  triviality  analysis  entirely;  the  net  result  would  only  be 
less  optimal  final  code. 
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; ; ; TRIVIALITY  ANALYSIS 

;;;  FOR  NOOES  ENCOUNTERED  WE  FILL  IN: 

;;;  TRIVP 

;;;  A COMBINATION  IS  TRIVIAL  IFF  ALL  ARGUMENTS  ARE  TRIVIAL,  AND 
;;;  THE  FUNCTION  CAN  BE  PROVED  TO  BE  TRIVIAL.  WE  ASSUME  CLOSURES 
;;;  TO  BE  NON- TR I VI AL  IN  THIS  CONTEXT,  SO  THAT  THE  CONVERT  FUNCTION 
;;;  WILL  BE  FORCED  TO  EXAMINE  THEM. 

(DEFINE  TR  IV  ANALYZE 

(LAMBDA  (NODE  REDOTH IS ) 

(IF  (OR  REDOTHIS  (EMPTY  (NODE\TRIVP  NODE))) 

(LET  ((FM  (NOOEXFORM  NODE ) ) 

(REDO  (IF  (EQ  REDOTHIS  'ALL)  'ALL  NIL))) 

( EQCASE  (TYPE  FM) 

(CONSTANT 

( AL TER-NOOE  NODE  (TRIVP  :»  T) ) ) 

(VARIABLE 

(ALTER -NODE  NOOE  (TRIVP  T))) 

(LAMBDA 

f -AIV-ANALYZE  ( LAMBOF.XBODY  FM)  REDO) 

(ALTER-NODE  NOOE  (TRIVP  NIL))) 

(IP 

(TRIV-ANALYZE  ( IF\PRE0  FM)  REDO) 

(TRIV-ANALYZE  (IFXCON  FM)  REDO) 

(TRIV-ANALYZE  ( IF\ALT  FM)  REDO) 

(ALTER-NODE  NOOE 

(TRIVP  :•  (AND  (NODEXTRIVP  (IFXPREO  FM)) 
(NODEXTRIVP  (IFXCON  FM)) 
(NODEXTRIVP  (IFXALT  FM)))))) 

(ASET 

(TRIV-ANALYZE  ( ASETXBOOY  FM)  REDO) 

(ALTER-NODE  NODE  (TRIVP  (NODEXTRIVP  (ASETXBOOY  FM))))) 
(CATCH 

(TRIV-ANALYZE  (CATCHXBOOY  FM)  REDO) 

(ALTER-NODE  NOOE  (TRIVP  NIL))) 

(LABELS 

( AMAPC  (LAMBDA  (F)  (TRIV-ANALYZE  F REDO)) 

(LABELSXFNDEFS  FM)) 

(TRIV-ANALYZE  (LABELSXBOOY  FM)  REDO) 

(ALTER-NODE  NODE  (TRIVP  :«  NIL))) 

(COMBINATION 

(LET  ((ARGS  ( COMB INAT IONXARGS  FM))) 

(TRIV-ANALYZE  (CAR  ARGS)  REDO) 

(DO  ((/  (COR  ARGS)  (CDR  A)) 

(SW  T (AND  SW  (NODEXTRIVP  (CAR  A))))) 

((NULL  A) 

( ALTER-NOOE  NODE 

(TRIVP  :•  (AND  SW 

(TRIV-ANALYZE-FN-P 
(CAR  ARGS)))))) 

(TRIV-ANALYZE  (CAR  A)  REDO))))))))) 


(OEFINE  TRIV-ANALYZE-FN-P 
(LAMBDA  (FN) 

(OR  (ANO  (EQ  (TYPE  (NOUEXFORM  FN))  ’VARIABLE) 

( TRIVFN  ( VARIABLEXVAR  (NOOEXFORM  FN)))) 

(ANO  (EQ  (TYPE  (NOOEXFORM  FN))  'LAMBDA) 

(NOOEXTRIVP  ( LAM60AX600Y  (NOOEXFORM  FN))))))) 
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EFFS-ANALYZE  analyzes  the  code  for  side-effects.  In  each  node  the  four 
slots  EFFS,  AFFD,  PEFFS,  and  PAFFD  are  filled  in.  Each  is  a set  of  side  effects, 
which  may  be  the  symbol  NONE,  meaning  no  side  effects;  ANY,  meaning  all  possible 
side  effects;  or  a list  of  specific  side  effect  names.  Each  such  name  specifies 
a category  of  possible  side  effects.  Typical  names  are  ASET,  RPLACD,  and  FILE 
(which  means  input/output  transactions). 

The  four  slots  EFFS,  AFFD,  PEFFS,  and  PAFFD  refer  to  the  node  they  are  in 
and  all  nodes  beneath  it.  Thus  each  is  computed  by  taking  the  union  of  the 
corresponding  sets  of  all  immediate  descendants,  then  adjoining  any  effects  due 
to  the  current  node. 

EFFS  is  the  set  of  side  effects  which  may  possibly  be  caused  at  or  below 
the  current  node;  PEFFS  is  the  set  of  side  effects  which  can  be  proved  to  occur 
at  or  below  the  node.  These  may  differ  because  of  ignorance  on  RABBIT'S  part. 
For  example,  the  node  for  a combination  (RPLACA  A B)  will  have  the  side-effect 
name  RPI.ACA  adjoined  to  both  EFFS  and  PEFFS,  because  the  RABBIT  knows  that  RPLACA 
causes  an  RPLACA  side  effect  (how  this  is  known  will  be  discussed  later).  On  the 
other  hand,  for  a combination  (F00  A B),  where  F00  is  some  user  function,  RABBIT 
can  only  conjecture  that  FOO  can  cause  any  conceivable  side  effect,  but  cannot 
prove  it.  Thus  EFFS  will  be  forced  to  be  ANY,  while  PEFFS  will  not. 

AFFD  is  the  set  of  side  effects  which  can  possibly  affect  the  evaluation 
of  the  current  node  or  its  descendants.  For  example,  an  RPLACA  side  effect  can 
affect  the  evaluation  of  (CAR  X),  but  on  the  other  hand  an  RPLACD  side  effect 
cannot.  PAFFD  is  the  corresponding  set  of  side  effects  for  which  it  can  be 
proved.  (This  set  is  "proved"  in  a less  rigorous  sense  than  for  PEFFS.  The  name 
RPLACA  would  be  put  in  the  PAFFD  set  for  (CAR  X),  even  though  the  user  might  know 
that  while  there  are  calls  to  RPLACA  in  his  program,  none  of  them  ever  modify  X. 
PEFFS  and  PAFFD  are  only  used  by  CHECK-COMBINATION-PEFFS  to  warn  the  user  of 
potential  conflicts  anyway,  and  serve  no  other  purpose.  EFFS  and  AFFD,  on  the 
other  hand,  are  used  by  the  optimizer  to  prevent  improper  code  motion.  Thus  EFFS 
and  AFFD  must  be  pessimistic,  and  err  only  on  the  safe  side;  while  PEFFS  and 
PAFFD  are  optimistic,  so  that  the  user  will  not  be  pestered  with  too  many  warning 
messages . ) 

The  CONS  side  effect  is  treated  specially.  A node  which  causes  the  CONS 
side  effect  must  not  be  duplicated,  because  each  instance  will  create  a new 
object;  but  whereas  two  RPLACA  side  effects  may  not  be  executed  out  of  order, 
two  CONS  side  effects  may  be. 

The  computation  of  AFFD  and  PAFFD  for  variables  depends  on  whether  the 
variable  is  global  or  not.  If  it  is,  SETQ  and  RPLACD  can  affect  it  (RPLACD  can 
occur  because  of  the  peculiarities  of  the  PDP-10  NacLISP  implementation); 
otherwise,  ASET  can  affect  it  if  indeed  any  ASET  refers  to  it  (in  which  case  ENV- 
ANALYZE  will  have  left  a WRITE-REFS  property);  otherwise,  nothing  can  affect  it. 
Similar  remarks  hold  for  the  computation  of  EFFS  and  PEFFS  for  an  ASET  node.  The 
name  SETQ  applies  to  modifications  of  global  variables,  while  ASET  applies  to 
local  variables. 


001 
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1 002 

; ; ; siof 

-EFFECTS  ANALYSIS 

003 

;;;  for 

NOOES  ENCOUNTEREO  WE  FILL  IN:  EFFS.  AFFD,  PEFFS , PAFFO 

004 

;;;  A SET  OF  SIDE  EFFECTS  MAY  BE  EITHER  'NONE  OR  'ANY,  OR  A SET. 

005 

006 

( OEF  INE 

EFFS-ANALYZE 

007 

(LAMBDA  ( NOOE  REOOTHIS) 

008 

(IF  (OR  REOOTHIS  (EMPTY  (NOOE\EFFS  NODE))) 

009 

(LET  ((FM  (NOOENFORM  NODE)) 

010 

(REDO  (IF  (EQ  REOOTHIS  ’ALL)  'ALL  NIL))) 

on 

(EQCASE  (TYPE  FM) 

012 

(CONSTANT 

013 

(ALTER -NOOE  NOOE 

014 

(EFFS  ’NONE) 

{ 015 

(AFFD  :«  ’NONE) 

016 

(PEFFS  :*  'NONE) 

017 

(PAFFO  :*  ’NONE))) 

018 

(VARIABLE 

019 

(LET  ((A  (CONO  ( ( VARIABLE\GLOBALP  FM)  '(SETQ)) 

| 0 20 

( (6ET  (VARIABLEWAR  FM)  'WRITE-REFS)  '(ASET)) 

021 

(T  'NONE)))) 

! 022 

( ALTER-NOOE  NODE 

023 

(EFFS  :*  'NONE) 

024 

(AFFD  :»  A) 

025 

(PEFFS  :*  'NONE) 

026 

(PAFFO  :*  A)))) 

027 

(LAMBDA 

028 

(EFFS-ANALYZE  ( LAMBDANBODY  FM)  REDO) 

029 

(ALTER-NOOE  NODE 

030 

(EFFS  :*  '(CONS)) 

031 

(AFFO  :*  NIL) 

032 

(PEFFS  :*  ’(CONS)) 

033 

(PAFFO  :*  NIL))) 

034 

(IF  (EFFS-ANALYZE- IF  NOOE  FM  REOO)) 

035 

(ASET 

036 

(EFFS-ANALYZE  (ASETNBODY  FM)  REOO) 

037 

(LET  ((ASETEFFS  (IF  (ASET\GLOBALP  FM) 

0 38 

'(SETQ) 

039 

'(ASET)))) 

040 

(ALTER-NOOE  NODE' 

04  1 

(EFFS  :*  (EFFS-UNION  ASETEFFS 

042 

(NOOENEFFS  (ASETNBODY  FM)))) 

043 

(AFFD  :«  (NOOENAFFO  (ASETNBODY  FM))) 

044 

(PEFFS  :*  (EFFS-UNION  ASETEFFS 

045 

(NOOENPEFFS  (ASETNBODY  FM)))) 

046 

(PAFFO  :•  (NOOENPAFFD  ( ASET\BOOY  FM)))))) 

04  7 

(CATCH 

| 048 

(EFFS-ANALYZE  (CATCHNBOOY  FM)  REDO) 

j 049 

(ALTER-NOOE  NOOE 

1 050 

(EFFS  :«  (NODENEFFS  (CATCHNBOOY  FM))) 

051 

(AFFD  :«  (NOOENAFFO  (CATCHNBOOY  FM))) 

052 

(PEFFS  :»  (NOOENPEFFS  (CATCHNBOOY  FM))) 

| 053 

(PAFFO  :«  ( NODE NP AFFD  (CATCHNBOOY  FM))))) 

054 

(LABELS 

, 055 

( AMAPC  (LAMBOA  (F)  (EFFS-ANALYZE  F REDO)) 

f 056 

(LABELSNFNDEFS  FM)) 

057 

(EFFS-ANALYZE  (LABELSNBODY  FM)  REOO) 

0 58 

(ALTER-NOOE  NODE 

059 

(EFFS  :«  (EFFS-UNION  '(CONS) 

060 

(NOOENEFFS  ( LABELSNBOOY  FM)))) 

061 

(AFFO  :•  (NOOENAFFO  (LABELSNBOOY  FM))) 

062 

(PEFFS  :«  (EFFS-UNION  '(CONS) 

063 

(NOOENPEFFS  (LABELSNBOOY  FM)))) 

064 

(PAFFO  :■  (NOOENPAFFD  (LABELSNBODY  FM))))) 

065 

(COMBINATION 

066 

( EF F S -ANALYZE -COMB INAT ION  NODE  FM  REDO))))))) 
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(While  it  may  be  held  that  allowing  ASET'  on  variables  is  unclean,  and 
that  the  use  of  cells  as  in  PLASMA  is  semantically  neater,  it  is  true  that 
because  of  the  lexical  scoping  rules  it  can  always  be  determined  whether  a given 
variable  is  ever  used  in  an  ASET'.  In  this  way  one  can  say  that  variables  are 
divided  by  the  compiler  into  two  classes:  those  which  are  implicitly  cells,  and 
those  which  are  not.) 

A closure  ( LAMBDA-expression ) causes  a CONS  side-effect.  This  is  not  so 
much  because  SCHEME  programs  depend  on  being  able  to  do  EQ  on  closures  (it  is 
unclear  whether  this  is  a reasonable  thing  to  specify  in  the  semantics  of 
SCHEME),  as  because  there  is  no  point  in  creating  two  closures  when  one  will 
suffice.  Adjoining  CONS  to  EFFS  will  prevent  the  creation  of  such  duplicate  code 
by  the  optimizer.  The  same  idea  holds  for  LABELS  (which  has  LAMBDA-expressions 
within  it). 

Notice  that  a LAMBDA  node  does  not  add  to  its  four  sets  the  information 
from  its  body's  sets.  This  is  because  evaluation  of  a LAMBDA-expression  does  not 
immediately  evaluate  the  body.  Only  later,  when  the  resulting  closure  is 
invoked,  is  the  body  executed. 


EFFS-UNION  gives  the  union  of  two  sets  of  side  effects.  It  knows  about 
the  special  symbols  NONE  and  ANY. 

F.FFS- ANALYZE  - IF  computes  the  side-effect  sets  for  IF  nodes.  It  has  been 
made  a separate  function  only  because  its  code  is  so  bulky;  it  must  perform  a 
three-way  union  for  each  of  four  sets. 

EFFS -ANALYZE -COMB  I NAT  ION  computes  the  side-effect  sets  for  COMBINATION 
nodes.  First  the  function  is  analyzed,  then  the  arguments.  The  unions  of  the 
four  sets  over  all  the  arguments  are  accumulated  in  EF,  AF,  PEF,  and  PAF.  CHECK- 
COMB INATION-PEFFS  is  called  to  warn  the  user  of  any  possible  violations  of  the 
rule  that  SCHEME  is  privileged  to  choose  the  order  in  which  to  evaluate  the 
subforms  of  a combination.  Finally,  there  are  three  cases  depending  on  the  form 
of  the  function  position. 

If  it  is  a variable,  then  the  property  list  of  the  variable  name  is 
searched  for  information  about  that  function.  (The  generated  names  for  local 
variables  will  never  have  any  such  information;  thus  information  will  be  found 
only  for  global  variables.  This  information  is  used  to  augment  the  sets.  (A 
clever  technique  not  used  in  RABBIT  would  be  to  arrange  for  situations  like 
((LAMBDA  (F)  <bodyl>)  (LAMBDA  (...)  <body2>),  where  F denotes  a "known  function" 
(see  the  description  of  BIND-ANALYZE  below),  to  put  on  the  property  list  of  F the 
side-effect  information  for  <body2>,  to  aid  optimization  in  <bodyl>.) 

If  the  function  position  is  a LAMBDA-expression,  then  the  four  sets  of 
the  body  of  the  LAMBDA-expression  are  unioned  into  the  four  sets  for  the 
COMBINATION  node.  This  is  because  in  this  case  we  know  that  the  body  LAMBDA- 
expression  will  be  executed  in  the  course  of  executing  the  COMBINATION  node. 

In  any  other  case,  an  unknown  function  is  computed,  and  so  it  must  be 
assumed  that  any  side-effect  is  possible  for  EFFS  and  AFFD. 
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( OEF 1NE  EFFS -UN ION 

(LAMBDA  (A  B) 

(COND  ( ( EO  A 'NONE)  B) 
((EO  B 'NONE)  A) 
((EO  A 'ANY)  'ANY) 
((EO  B 'ANY)  'ANY) 
(T  (UNION  A B))))) 
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(DEF 1NE  EF  F S - ANAL YZE - IF 

(LAMBDA  ( NOOE  FM  REDO) 

(BLOCK  (EFFS-ANALYZE  ( IF \PRE0  FM)  REDO) 

(EFFS-ANALYZE  (IFSCON  FM)  REDO) 

(EFFS-ANALYZE  ( 1F\ALT  FM)  REDO) 

(ALTER-NOOE  NODE 

(EFFS  :<=  (EFFS-UN10N  (NOOESEFFS  (!F\PRED  FM)) 

(EFFS-UNION  (NOOESEFFS  (IF\CON  FM)) 

( NOOESEFFS  ( IF\ALT  FM))))) 

( AFFD  :*  (EFFS-UNION  (NODESAFFD  ( IF\PRED  FM)) 

(EFFS-UNION  ( NODE  \ AFFD  ( IF  SCON  FM)) 

(NOOESAFFO  ( IF  SALT  FM))))) 

(PEFFS  :=  (EFFS-UNION  (NODESPEFFS  (IFSPRED  FM)) 

(EFFS-UNION  (NODESPEFFS  (1FSCON  FM)) 

(NODESPEFFS  ( IF  SALT  FM))))) 

( PAFFD  :*  (EFFS-UNION  (NODESPAFFO  (IFSPRED  FM)) 

(EFFS-UNION  (NODESPAFFO  (IFSCON  FM)) 

(NODESPAFFO  ( IF  SALT  FM))))))))) 


(SET'  ‘CHECK- PEF F S*  NIL) 


(DEFINE  EFFS-ANALYZE -COMB  I NAT  ION 
(LAMBDA  (NODE  FM  REDO) 

(LET  ((ARGS  ( COMB INAT IONSARGS  FM))) 

(EFFS-ANALYZE  (CAR  ARGS)  REDO) 

(DO  ((A  ( CDR  ARGS)  (COR  A)) 

(EF  'NONE  (EFFS-UNION  EF  (NOOESEFFS  (CAR  A)))) 

(AF  'NONE  (EFFS-UNION  AF  (NODESAFFD  (CAR  A)))) 

( PEF  ’NONE  (EFFS-UNION  PEF  (NOOESPEFFS  (CAR  A)))) 

( PAF  'NONE  (EFFS-UNION  PAF  (NODESPAFFO  (CAR  A))))) 

((NULL  A) 

(IF  ‘CHECK-PE FFS«  (CHECK -COMB  I NAT  ION -PEFFS  FM)) 

(COND  ((EO  (TYPE  (NODESFORM  (CAR  ARGS)))  ’VARIABLE) 

(LET  ((V  (VARIABLESVAR  (NODESFORM  (CAR  ARGS))))) 

(LET  ( ( VE  (GET  V ' FN-SIOE-EFFECTS) ) 

(VA  (GET  V ' FN-SIDE-AFFECTEO ) ) ) 

(ALTER-NOOE  NOOE 

(EFFS  :*  (IF  VE  (EFFS-UNION  EF  VE ) 'ANY)) 
(AFFD  :«  (IF  VA  (EFFS-UNION  AF  VA)  ’ANY)) 
(PEFFS  :•  (EFFS-UNION  PEF  VE ) ) 

(PAFFD  :*  (EFFS-UNION  PAF  VA)))))) 

((EO  (TYPE  (NOOESFORM  (CAR  ARGS)))  ’LAMBDA) 

(LET  ((B  (LAMBDASBODY  (NODESFORM  (CAR  ARGS))))) 

(ALTER-NODE  NOOE 

(EFFS  :=  (EFFS-UNION  EF  (NOOESEFFS  B))) 

(AFFD  :>  (EFFS-UNION  AF  (NODESAFFD  8))) 

(PEFFS  :*  (EFFS-UNION  PEF  (NODESPEFFS  B))) 

(PAFFD  :•  (EFFS-UNION  PAF  (NODESPAFFO  B ) ) ) ) ) ) 

(T  (ALTER-NOOE  NODE 

(EFFS  :*  ’ANY) 

(AFFD  :•  ’ANY) 

(PEFFS  :*  (EFFS-UNION  PEF 

(NODESPEFFS  (CAR  ARGS)))) 
(PAFFD  :•  (EFFS-UNION  PAF 

(NODESPAFFO  (CAR  ARGS)))))))) 


(EFFS-ANALYZE  (CAR  A)  REDO))))) 
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CIIECK-COMBINATION-PEFFS  checks  all  the  argument  forms  of  a combination 
(including  the  function  position)  to  see  if  they  are  all  independent  of  each 
other  with  respect  to  side  effects.  If  not,  a warning  is  issued.  This  is 
because  the  semantics  of  SCHEME  specify  that  the  arguments  may  be  evaluated  in 
any  order,  and  the  user  may  not  depend  on  a particular  ordering. 

The  test  is  made  by  comparing  all  pairs  of  arguments  within  the 
combination.  If  the  side-effects  of  one  can  "provably"  affect  the  evaluation  of 
the  other,  or  if  they  both  cause  a side  effect  of  the  same  category  (other  than 
CONS,  which  is  special),  then  the  results  may  depend  on  which  order  they  are 
evaluated  in.  The  test  is  not  completely  rigorous,  and  may  err  in  either 
direction,  but  "probably"  a reasonably  written  SCHEME  program  will  satisfy  the 
test . 


This  check  is  controlled  by  the  switch  *CHECK-PEFFS*  in  EFFS-ANALYZE- 
COMBINATION.  This  switch  is  provided  because  empirical  tests  show  that 
performing  the  test  slows  down  compilation  by  twenty  to  thirty  percent.  The  test 
has  proved  valuable  in  trapping  programming  errors,  and  so  is  normally  on,  but  it 
can  be  turned  off  for  speed  in  compiling  programs  in  which  one  has  confidence. 


EFFDEF  is  a macro  which  expands  into  a number  of  DEFPROP  forms.  This  is 
used  to  define  side-effect  information  about  primitive  functions.  For  example: 

(EFFDEF  CADR  NONE  ( RPLACA  RPLACD) ) 

states  that  CADR  causes  no  side-effects,  and  is  "provably"  affected  by  the  RPLACA 
and  RPLACD  categories  of  side-effects.  Similarly: 

(EFFDEF  MEMO  NONE  (RPLACA  RPLACD)  T) 

specifies  the  same  information  for  MEMQ,  but  the  "T"  means  that  a call  to  MEMQ 
with  constant  arguments  may  be  "folded"  (evaluated,  and  the  result  compiled 
instead),  despite  the  fact  that  some  side  effects  can  affect  it.  This  represents 
a judgement  that  it  is  extremely  unlikely  that  someone  will  write  a program  which 
modifies  a constant  argument  to  be  given  to  MEMQ. 
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(DEFINE  CHECK-COMB INAT ION-PEFFS 
(LAMBOA  (FM) 

(IF  (NOT  ( COMB  INAT  IONWARNP  FM)) 

(DO  ((A  ( COMB INAT ION\ARGS  FM)  (COR  A))) 

((NULL  A)) 

(00  (<B  (COR  A)  (COR  B))) 

((NULL  B)) 

(IF  (NOT  (EFFECTLESS  (EFFS- INTERSECT  (NOOENPEFFS  (CAR  A)) 

(NODENPAFFD  (CAR  B))))) 

(BLOCK  (WARN  |co-argument  may  affect  later  one| 

(NOOENSEXPR  (CAR  A)) 

"(EFFECTS  * ,(NODE\PEFFS  (CAR  A))) 

(NOOE\SEXPR  (CAR  B)) 

"(AFFECTED  BY  ,(NOOE\PAFFO  (CAR  B)))) 

( AL  TER -COMB  I NAT  ION  FM  { WARNP  :=  T)))) 

(IF  (NOT  (EFFECTLESS  (EFFS- INTERSECT  (NOOENPEFFS  (CAR  B)) 

( NODE \PAFFD  (CAR  A))))) 

(BLOCK  (WARN  |co-argument  may  affect  earlier  one  | 

( NOOENSEXPR  (CAR  B)) 

"(EFFECTS  ■=  .(NOOENPEFFS  (CAR  B))) 

( NOOENSEXPR  (CAR  A)) 

"(AFFECTED  BY  .(NOOENPAFFO  (CAR  A)))) 

( ALTER -COMB INAT ION  FM  (WARNP  :«  T)))) 

(IF  (NOT  (EFFECTLESS-EXCEPT-CONS  ( EFFS- INTERSECT  (NOOENPEFFS  (CAR  A)) 

(NOOENPEFFS  (CAR  B))))) 

(BLOCK  (WARN  |co- arguments  may  have  interfering  effectsl 
(NOOENSEXPR  (CAR  A)) 

"(EFFECTS  ■=  .(NOOENPEFFS  (CAR  A))) 

(NOOENSEXPR  (CAR  B)) 

"(EFFECTS  - .(NOOENPEFFS  (CAR  B)))) 

( ALTER -COMB  I NAT  ION  FM  (WARNP  :«  T))))))))) 


( DEFMAC  EFFDEF  (FN  EFFS  AFFD  . FOLD) 

“( PROGN  ( DEFPROP  .FN  .EFFS  FN-SIOE -EFFECTS ) 

( OEFPROP  .FN  .AFFD  FN-S IDE- AFFECTED ) 

, ( ANO  FOLD  "(OEFPROP  .FN  T OKAY-TO-FOLO ) ) ) ) 

(DECLARE  (/BOEFINE  EFFOEF  |SIDE  EFFECTSI)) 


i 
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This  page  contains  declarations  of  side-effect  information  for  many 
standard  primitive  functions.  The  EFFDEF  macro  used  to  make  the  declarations  is 
described  on  the  previous  page. 


001 

002 

( PROGN  ’COMPILE 

003 

(EFFDEF 

+ NONE  NONE) 

004 

( EFFDEF 

- NONE  NONE) 

005 

(EFFDEF 

* NONE  NONE) 

006 

(EFFDEF 

//  NONE  NONE) 

007 

(EFFDEF 

■=  NONE  NONE  ) 

008 

(EFFDEF 

< NONE  NONE) 

009 

(EFFDEF 

> NONE  NONE) 

010 

(EFFDEF 

CAR  NONE  (RPLACA)) 

on 

(EFFDEF 

CDR  NONE  ( RPLACO ) ) 

012 

(EFFDEF 

CAAR  NONE  (RPLACA)) 

013 

(EFFDEF 

CADR  NONE  (RPLACA  RPLACO)) 

014 

(EFFDEF 

CDAR  NONE  (RPLACA  RPLACO)) 

015 

(EFFDEF 

CODR  NONE  (RPLACO)) 

016 

(EFFDEF 

CAAAR  NONE  (RPLACA)) 

017 

(EFFDEF 

CAADR  NONE  (RPLACA  RPLACO)) 

018 

( EFFDEF 

CAOAR  NONE  (RPLACA  RPLACO)) 

019 

(EFFDEF 

CADDR  NONE  (RPLACA  RPLACD ) ) 

020 

(EFFDEF 

CDAAR  NONE  (RPLACA  RPLACD)) 

021 

(EFFDEF 

COADR  NONE  (RPLACA  RPLACO)) 

022 

(EFFDEF 

CODAR  NONE  (RPLACA  RPLACD)) 

023 

(EFFDEF 

CDODR  NONE  (RPLACO)) 

024 

(EFFDEF 

CAAAAR  NONE  (RPLACA)) 

025 

(EFFDEF 

CAAADR  NONE  (RPLACA  RPLACD)) 

026 

(EFFDEF 

CAADAR  NONE  (RPLACA  RPLACD)) 

027 

(EFFDEF 

CAADOR  NONE  (RPLACA  RPLACO)) 

028 

(EFFDEF 

CADAAR  NONE  (RPLACA  RPLACD)) 

029 

(EFFDEF 

CADADR  NONE  (RPLACA  RPLACD)) 

030 

(EFFDEF 

CADDAR  NONE  (RPLACA  RPLACO)) 

031 

(EFFDEF 

CADDDR  NONE  (RPLACA  RPLACO)) 

032 

(EFFDEF 

CDAAAR  NONE  (RPLACA  RPLACO)) 

033 

(EFFDEF 

CDAADR  NONE  (RPLACA  RPLACO)) 

0 34 

(EFFDEF 

CDAOAR  NONE  (RPLACA  RPLACD)) 

035 

(EFFDEF 

CDAODR  NONE  (RPLACA  RPLACO)) 

036 

(EFFDEF 

CDDAAR  NONE  (RPLACA  RPLACO)) 

037 

(EFFDEF 

CODADR  NONE  (RPLACA  RPLACD)) 

038 

(EFFDEF 

COOOAR  NONE  (RPLACA  RPLACD)) 

039 

(EFFOEF 

CDDODR  NONE  (RPLACO)) 

040 

(EFFDEF 

CXR  NONE  (RPLACA  RPLACO)) 

041 

(EFFDEF 

RPLACA  (RPLACA)  NONE) 

042 

(EFFOEF 

RPLACD  (RPLACA)  NONE) 

043 

(EFFDEF 

RPLACX  (RPLACA  RPLACO)  NONE) 

044 

(EFFDEF 

EQ  NONE  NONE) 

045 

(EFFOEF 

ATOM  NONE  NONE) 

046 

(EFFDEF 

NUMBERP  NONE  NONE) 

047 

(EFFOEF 

TYPEP  NONE  NONE) 

048 

(EFFDEF 

SYMBOLP  NONE  NONE) 

049 

(EFFDEF 

HUNKP  NONE  NONE) 

050 

(EFFDEF 

FIXP  NONE  NONE) 

051 

(EFFOEF 

FLOATP  NONE  NONE) 

052 

(EFFOEF 

BIGP  NONE  NONE) 

053 

(EFFDEF 

NOT  NONE  NONE) 

054 

(EFFDEF 

NULL  NONE  NONE) 

055 

(EFFDEF 

CONS  (CONS)  NONE) 

056 

(EFFOEF 

LIST  (CONS)  NONE) 

057 

(EFFOEF 

APPEND  (CONS)  (RPLACD)) 

056 

(EFFDEF 

MEMO  NONE  (RPLACA  RPLACO)  T) 

059 

(EFFDEF 

ASSO  NONE  (RPLACA  RPLACO)  T) 

060 

(EFFDEF 

PRINT  (FILE)  (FILE  RPLACA  RPLACO)) 

061 

(EFFDEF 

PR  INI  (FILE)  (FILE  RPLACA  RPLACO)) 

062 

(EFFDEF 

PRINC  (FILE)  (FILE  RPLACA  RPLACD)) 

063 

(EFFDEF 

TERPRI  (FILE)  (FILE)) 

064 

(EFFDEF 

TYO  (FILE)  (FILE)) 

065 

(EFFDEF 

READ  ANY  (FILE)) 

066 

(EFFOEF 

TYI  ANY  (FILE)) 

067 

' SIOE-EFFECTS-PROPERTIES ) 
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ERASE-NODE  and  ERASE-ALL-NODES  are  convenient  mnemonic  macros  used  to 

invoke  ERASE-NODES. 

ERASE-NODES  is  used  by  the  optimizer  to  destroy  nodes  which  have  been 
removed  from  the  program  tree  because  of  some  optimization.  If  ALLP  is  NIL 

(ERASE-NODE),  then  only  the  given  node  is  erased;  if  it  is  T (ERASE-ALL-NODES), 
then  the  given  node  and  all  descendants,  direct  and  indirect,  are  erased. 

Erasing  a node  may  involve  removing  certain  properties  from  property 
lists.  This  is  necessary  to  maintain  the  consistency  of  the  properties.  For 

example,  if  a VARIABLE  node  is  erased,  then  that  node  mist  be  removed  from  the 

READ-REFS  property  of  the  variable  name.  The  optimizer  depends  on  this  so  that, 
for  example,  it  can  determine  whether  all  references  to  a variable  have  been 
erased . 


It  should  be  noted  in  passing  that  in  principle  all  occurrences  of  ASET 
on  a given  variable  could  be  erased,  thereby  reducing  its  WRITE-REFS  property  to 
NIL.  Because  the  EFFS-ANALYZE  computation  on  VARIABLE  nodes  used  the  WRITE-REFS 
property,  a VARIABLE  node  might  have  ASET  in  its  AFFD  set  after  the  optimizer  had 
removed  all  the  ASET  nodes.  Because  of  the  tree-walking  discipline  of  the 
optimizer,  the  VARIABLE  nodes  will  not  be  reanalyzed  immediately.  This  cannot 
hurt,  however;  it  may  just  cause  the  optimizer  later  to  be  more  cautious  than 
necessary  when  examining  a VARIABLE  node.  (If  this  doesn't  make  sense,  come  back 
after  reading  the  description  of  the  optimizer.) 

The  flag  *TESTING*  is  used  to  determine  whether  or  not  to  remove  the  node 
from  the  NODE  property  on  the  node's  name.  When  debugging,  it  is  very  useful  to 
keep  this  information  around  to  trace  the  optimizer's  actions;  but  when 
compiling  a large  function  for  "production"  purposes,  the  discarded  nodes  may 
bloat  memory,  and  so  they  must  be  removed  from  the  NODE  property  in  order  that 
they  may  be  garbage-collected  by  LISP. 
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;;;  THIS  ROUTINE  IS  USED  TO  UNDO  ANY  PASS  1 ANALYSIS  ON  A NOOE . 

(OEFMAC  ERASE -NOOE  (NODE)  "( ERASE -NODES  .NODE  NIL)) 

(OEFMAC  ERASE -ALL -NOOES  (NOOE)  "(ERASE-NODES  .NODE  T)) 

(DEFINE  ERASE-NODES 

(LAMBDA  (NOOE  ALLP ) 

(LET  ((FM  (NODEVORM  NODE))) 

(OR  (EQ  (TYPE  NOOE)  'NOOE) 

(ERROR  1 |Cannot  erase  a non-node|  NODE  'FAIL-ACT)) 

(EOCASE  (TYPE  FM) 

(CONSTANT) 

(VARIABLE 

( OELPROP  (VARIABLENVAR  FM)  NODE  ’READ-REFS)) 

(LAMBDA 

(IF  ALLP  (ERASE-ALL-NODES  ( L AMBDA\BODY  FM))) 

(IF  (NOT  ‘TESTING* ) 

( AMAPC  (LAMBDA  (V)  (REMPROP  V 'BINDING))  (LAMBDAWARS  FM)))) 
(IF  (COND  (ALLP  (ERASE-ALL-NODES  { IF\PRED  FM)) 

(ERASE-ALL-NODES  (IFNCON  FM)) 

(ERASE-ALL-NOOES  ( IF\ALT  FM))))) 

(ASET 

(IF  ALLP  (ERASE-ALL-NOOES  (ASET\BOOY  FM))) 

(OELPROP  (ASETWAR  FM)  NOOE  ' WR I TE -REFS ) ) 

(CATCH 

(IF  ALLP  (ERASE-ALL-NODES  (CATCH\BOOY  FM))) 

(IF  (NOT  ‘TESTING* ) 

(REMPROP  (CATCHWAR  FM)  'BINDING))) 

(LABELS 

(COND  (ALLP  (AMAPC  (LAMBDA  (0)  (ERASE -ALL -NODES  D)) 

( LABELSNFNDEFS  FM)) 

(ERASE-ALL-NOOES  (LABELS\BOOY  FM)))) 

( IF  (NOT  ‘TESTING*) 

(AMAPC  (LAMBDA  (V)  (REMPROP  V ’BINDING) ) ( LABELSVF NVARS  FM)))) 
(COMBINATION 

(IF  ALLP  (AMAPC  (LAMBDA  (A)  ( ERASE -ALL -NODES  A)) 

( COMB  I NAT !ON\ARGS  FM))))) 

(IF  (NOT  ‘TESTING*) 

(REMPROP  (NOOEXNAME  NOOE)  'NODE))))) 
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META- EVALUATE  is  the  top-level  function  of  the  optimizer.  It  accepts  a 
node,  and  returns  a node  (not  necessarily  the  same  one)  for  an  equivalent 
program. 


The  METAP  flags  in  the  nodes  are  used  to  control  re-analysis.  META- 
EVALUATE  checks  this  flag  first  thing,  and  returns  the  given  node  immediately  if 
its  METAP  flag  is  non-NIL,  meaning  the  node  has  already  been  properly  optimized. 
Otherwise  it  examines  the  node  more  carefully. 

Some  rules  about  the  organization  of  the  optimizer: 

[ 1 J A node  returned  by  a call  to  META-EVALUATE  will  always  have  its  METAP  flag 

set . 

[2]  The  descendants  of  a node  must  be  meta-evaluated  before  any  information  in 
them  is  used. 

[3]  If  a node  has  its  METAP  flag  set,  so  do  all  of  its  descendants.  Moreover, 
REANALYZE  1 has  been  applied  to  the  node,  so  all  of  the  information  filled  in  by 
pass-1  analysis  ( ENV-ANALYZE,  TRIV-ANALYZE,  and  EFFS-ANALYZE)  is  up-to-date. 

When  COMPILE  calls  META-EVALUATE,  all  the  METAP  flags  are  NIL,  and  no 
pass-1  analysis  has  been  performed.  META-EVALUATE,  roughly  speaking,  calls 
itself  recursively,  and  meta-evaluates  the  node  tree  from  the  bottom  up.  After 
meta-evaluating  all  the  descendants  of  a node,  it  applies  REANALYZE1  to  perform 
pass-1  analysis  on  that  node,  sets  the  METAP  flag,  and  returns  the  node. 
Exceptions  can  be  made  to  this  discipline  if  a non-trivial  optimization  occurs. 

If  the  (meta-evaluated)  predicate  part  of  an  IF  node  is  itself  an  IF  node 
(and  the  debugging  switch  *FUDGE*  is  set),  then  META-IF-FUDGE  is  called.  If  it 
is  a constant,  then  the  value  of  the  constant  is  used  to  select  either  the 
consequent  CON  or  the  alternative  ALT.  The  other  one  is  then  erased,  and  the  IF 
node  is  itself  erased.  The  selected  component  node  is  then  returned  (it  has 
already  been  meta-evaluated).  The  statistics  counter  *DEAD-COUNT*  counts 
occurrences  of  this  "dead  code  elimination"  optimization. 

The  other  two  interesting  cases  are  COMBINATION  nodes  whose  function 
position  contains  either  a trivial  function  or  a LAMBDA  node.  META-COMBINATION- 
TRIVFN  and  META-COMBINATION-LAMBDA  handle  these  respective  cases. 
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THE  VALUE  OF  META-EVALUATE  IS  THE  (POSSIBLY  NEW)  NODE  RESULTING  FROM  THE  GIVEN  ONE. 

(SET1  ‘FUDGE*  T)  ;SWITCH  TO  CONTROL  META- IF -FUDGE 

(SET*  ‘DEAD-COUNT*  0)  ;COUNT  OF  DEAD-COOE  ELIMINATIONS 

(DEFINE  META-EVALUATE 
(LAMBOA  (NODE) 

(IF  (NOOE \METAP  NODE) 

NOOE 

(LET  ( ( FM  (NODEXFORM  NODE))) 

(EQCASE  (TYPE  FM) 

(CONSTANT 
( REANALYZE  1 NOOE) 

(ALTER-NODE  NODE  (METAP  :•  T))) 

(VARIABLE 
( REANALYZE  1 NODE) 

(ALTER-NOOE  NOOE  (METAP  :»  T)))  ' 

(LAMBDA 

(ALTER-LAMBDA  FM  (BODY  :•  (META-EVALUATE  (LAMBOAXBOOY  FM)))) 

(RE ANALYZE 1 NODE) 

(ALTER-NODE  NODE  (METAP  :«  T))) 

(IF  , 

( ALTER-IF  FM  < 

(PREO  :•  (META-EVALUATE  ( IF\PRED  FM)))  1 

(CON  :*  (META-EVALUATE  (IF\CON  FM))) 

(ALT  :«  (META-EVALUATE  ( IF\ALT  FM))))  I 

(IF  (AND  ‘FUDGE*  ( EQ  (TYPE  (NODE\FORM  ( IFXPRED  FM)))  'IF))  1 

(META- IF -FUDGE  NODE)  J 

(IF  (EO  (TYPE  (NODEXFORM  ( IFXPRED  FM)))  ’CONSTANT)  J 

(LET  ((CON  (IFXCON  FM)) 

(ALT  (IFXALT  FM)) 

(VAL  (CONSTANTXVALUE  (NODEXFORM  (IFXPRED  FM))))) 

(ERASE -NODE  NODE) 

(ERASE- ALL -NOOES  (IFXPREO  FM)) 

(INCREMENT  *DEAD-COUNT* ) ' 

(IF  VAL 

(BLOCK  (ERASE -ALL -NOOES  ALT)  CON) 

(BLOCK  (ERASE-ALL-NOOES  CON)  ALT))) 

(BLOCK  ( REANALYZE 1 NODE) 

(ALTER-NOOE  NODE  (METAP  :■  T)))))) 

(ASET 

(ALTER-ASET  FM  (BOOY  :•  (META-EVALUATE  ( ASETXBOOY  FM)))) 

( REANALYZE  1 NODE) 

(ALTER-NOOE  NOOE  (METAP  :•  T))) 

(CATCH 

(ALTER-CATCH  FM  (BOOY  :*  (META-EVALUATE  (CATCHXBODY  FM)))) 

( REANALYZE  1 NOOE) 

(ALTER-NOOE  NODE  (METAP  :■  T))) 

(LABELS  I 

(DO  ((D  (LABELSXFNDEFS  FM)  (CDR  0))) 

((NULL  0)) 

( RPLACA  D (META-EVALUATE  (CAR  D)))) 

(ALTER-LABELS  FM  (BODY  :«  (META-EVALUATE  (LABELSXBOOY  FM)))) 

( REANALYZE  1 NODE) 

(ALTER-NOOE  NODE  (METAP  :•  T))) 

(COMBINATION 

(LET  ((FN  (NODEXFORM  (CAR  ( COMB  I NAT IONXARGS  FM))))) 

(COND  ((AND  (EQ  (TYPE  FN)  ’VARIABLE) 

(TRIVFN  ( VARIABLEXVAR  FN))) 

(META-COMBINATION-TRIVFN  NODE))  , 

((EO  (TYPE  FN)  ’LAMBDA) 

(META-COMBINATION-LAMBDA  NODE)) 

(T  (DO  ((A  (COMB  I NAT  IONXARGS  FM)  (CDR  A))) 

((NULL  A)) 

(RPLACA  A (META-EVALUATE  (CAR  A)))) 

( REANALYZE  1 NODE) 

(ALTER-NOOE  NOOE  (METAP  :*  T))))))))))) 


160 


For  an  IF  nested  within  another  IF,  the  transformation  shown  in  the 
comment  is  performed.  This  involves  constructing  an  S-expression  of  the 
appropriate  form  and  then  calling  ALPHATIZE  to  convert  it  into  a node-tree.  (The 
node-tree  could  be  constructed  directly,  but  it  is  easier  to  call  ALPHATIZE. 
This  is  the  reason  why  ALPHATIZE  merely  returns  a NODE  if  it  encounters  one  in 
the  S-expression;  META-IF-FUDGE  inserts  various  nodes  in  the  S-expression  it 
constructs.)  The  original  two  IF  nodes  are  erased,  a statistics  counter  ★FUDGE- 
COUNT*  is  incremented,  and  the  new  expression  is  meta-evaluated  and  returned  in 
place  of  the  nested  IF  nodes. 

(The  statistics  counter  shows  that  this  optimization  is  performed  with 
modest  frequency,  arising  from  cases  such  as  (IF  (AND 


NETA-CONBINATION-TRIVFN  performs  the  standard  recursive  meta-evaluation 
of  all  the  arguments,  and  then  checks  to  see  whether  the  combination  can  be 
"folded".  This  is  possible  all  the  arguments  are  constants,  and  if  the  function 
has  no  side  effects  and  cannot  be  affected  by  side-effects,  or  has  an  OKAY-TO- 
FOLD  property.  If  this  is  the  case,  the  function  is  applied  to  the  arguments, 
the  combination  node  and  its  descendants  are  erased,  the  statistics  counter 
★FOLD-COUNT*  is  bumped,  and  a new  CONSTANT  node  containing  the  result  is  created 
and  meta-evaluated.  This  might  typically  occur  for  (NOT  NIL)  =>  T,  or  (+  3 4)  => 
7,  or  ( NEMQ  'BAR  '(FOO  BAR  BAZ))  =>  '(BAR  BAZ).  If  this  optimization  is  not 
permissible,  then  the  usual  reanalysis  and  setting  of  the  METAP  flag  is 
performed . 

(The  statistics  counter  shows  that  even  in  a very  large  program  such  as 
RABBIT  this  optimization  is  performed  fewer  than  a dozen  times.  This  may  be  due 
to  my  programming  style,  or  because  there  are  very  few  macros  in  the  code  for 
RABBIT  which  might  expand  into  foldable  code.) 
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;;;  TRANSFORM  (IF  (IF  A B C)  D E)  INTO: 

; ; ; ((LAMBDA  (D1  El) 

;;;  (if  a (if  b (di)  (ei))  (if  c (di)  <ei)))) 

;;;  (LAMBDA  ( ) D) 

; ; ; (LAMBDA  ( ) E)) 

(SET*  *FUDGE -COUNT*  0)  ;COUNT  OF  IF-FUOGES 

(DEFINE  META- IF -FUDGE 
(LAMBDA  (NODE) 

(LET  ((FM  (NOOE\FORM  NOOE ) ) ) 

(LET  ( ( PFM  (NODEXFORM  (IFXPREO  FM)))) 

(LET  ((N  (ALPHATIZE  (LET  ((CONVAR  (GENTEMP  ’META-CON)) 

( ALTVAR  (GENTEMP  ’META-ALT))) 

"((LAMBDA  (.CONVAR  .ALTVAR) 

(IF  ,( IF\PRED  PFM) 

(IF  , ( IF  SCON  PFM) 

( .CONVAR) 

( .ALTVAR)) 

(IF  , ( IF\ALT  PFM) 

( .CONVAR) 

(.ALTVAR)))) 

(LAMBDA  ()  . ( I F XCON  FM)) 

(LAMBDA  ()  ,( IF\ALT  FM)))) 

(NODEXENV  NODE))))  ;DOESN'T  MATTER 

(ERASE-NODE  NOOE) 

(ERASE-NODE  (IF\PRE0  FM)) 

(INCREMENT  *FUDGE -COUNT* ) 

(META-EVALUATE  N)))))) 

;;;  REDUCE  A COMBINATION  WITH  A SIOE-EFFECT-LESS  TRIVIAL 
;;;  FUNCTION  AND  CONSTANT  ARGUMENTS  TO  A CONSTANT. 

(SET’  Of  OLD-COUNT*  0)  ;COUNT  OF  CONSTANT  FOLDINGS 

(DEFINE  META- COMB  I NAT ION-TRI VFN 
(LAMBDA  (NODE) 

(LET  ((FM  (NODEXFORM  NODE))) 

(LET  ((ARGS  ( COMB  I NAT  1 ON\ARGS  FM))) 

(RPLACA  ARGS  (META-EVALUATE  (CAR  ARGS))) 

(DO  ((A  (COR  ARGS)  (CDR  A)) 

(CONSTP  (LET  ((FNNAME  ( VARIABLENVAR  ( NOOE\FORM  (CAR  ARGS))))) 

(OR  (AND  (EQ  (GET  FNNAME 

'FN- SIDE -EFFECTS) 

’NONE) 

(EQ  (GET  FNNAME 

’FN-SIOE- AFFECTED) 

•NONE)) 

(GET  FNNAME  ’ OKAY-TO-FOLD) ) ) 

(AND  CONSTP  (EQ  (TYPE  (NODEXFORM  (CAR  A)))  ’CONSTANT)))) 

((NULL  A) 

(CONO  (CONSTP 

(LET  ((VAL  (APPLY  (VARIABLENVAR  (NODEXFORM  (CAR  ARGS))) 

( AMAPCAR  (LAMBDA  (X) 

(CONST ANTXVALUE 
(NODEXFORM  X))) 

(COR  ARGS))))) 

(ERASE-ALL-NODES  NODE) 

(INCREMENT  *FOLD-COUNT*) 

(META-EVALUATE  (ALPHATIZE  "(QUOTE  .VAL)  NIL)))) 

(T  ( REANALYZE  1 NODE) 

(ALTER -NOOE  NODE  (METAP  :•  T))))) 

(RPLACA  A (META-EVALUATE  (CAR  A)))))))) 
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META -COMB  I NAT ION -LAMBDA  performs  several  interesting  optimizations  on 
combinations  of  the  form  ((LAMBDA  ...)  ...).  It  is  controlled  by  several 
debugging  switches,  and  keeps  several  statistics  counters,  which  we  will  not 
describe  further. 

First  all  the  arguments,  but  not  the  LAMBDA-expression,  are  meta- 
evaluated  by  the  first  DO  loop.  Next,  the  body  of  the  LAMBDA  node  is  meta- 
evaluated  and  kept  in  the  variable  B in  the  second  DO  loop.  This  loop  iterates 
over  the  LAMBDA  variables  and  the  corresponding  arguments.  For  each  variable- 
argument  pair,  8UB5T-CANDIDATE  determines  whether  the  argument  can  "probably"  be 
legally  substituted  for  occurrences  of  the  variable  in  the  body.  If  so,  META- 
SUBSTITUTE is  called  to  attempt  such  substitution.  When  the  loop  finishes,  B has 
the  body  with  all  possible  substitutions  performed.  This  body  is  then  re-meta- 
evaluatnd . (The  reason  for  this  is  explained  later  in  the  discussion  of  META- 
SUBSTITUTE.) 

Next  an  attempt  is  made  to  eliminate  LAMBDA  variables.  A variable  and 
its  corresponding  argument  may  be  eliminated  if  the  variable  has  no  remaining 
references,  and  the  argument  either  has  no  side  effects  or  has  been  successfully 
substituted.  (If  an  argument  has  side  effects,  then  SUBST-CANDIDATE  will  give 
permission  to  attempt  substitution  only  if  no  more  than  one  reference  to  the 
corresponding  variable  exists.  If  the  substitution  fails,  then  the  argument  may 
not  be  eliminated,  because  its  side  effects  must  not  be  lost.  It  the 
substitution  succeeds,  then  the  argument  must  be  eliminated,  because  the  side 
effects  must  not  be  duplicated.)  A consistency  check  ensures  that  in  fact  the 
variable  is  unreferenced  within  the  body  as  determined  by  its  REFS  and  ASETS 
slots;  then  the  argument  and  variable  are  deleted,  and  the  nodes  of  the  argument 
are  erased. 

When  all  possible  variable-argument  pairs  have  been  eliminated,  then 
there  are  two  cases.  If  the  LAMBDA  has  no  variables  left,  then  the  combination 
containing  it  can  be  replaced  by  the  body  of  the  LAMBDA  node.  In  this  case  the 
LAMBDA  and  COMBINATION  nodes  are  erased.  Otherwise  the  LAMBDA  and  COMBINATION 
nodes  are  reanalyzed  and  their  METAP  flags  are  set. 

(The  statistics  counters  show  that  when  RABBIT  compiles  itself  these 
three  optimizations  arc  performed  hundreds  of  times.  This  occurs  because  many 
standard  macros  make  use  of  closures  to  ensure  that  variables  local  to  the  code 
for  the  macro  do  not  conflict  with  user  variables.  These  closures  often  can  be 
substituted  into  the  code  by  the  compiler  and  eliminated.) 


(SET'  *FLUSH-ARGS*  T) 
(SET1  ‘FLUSH-COUNT*  0) 
(SET'  * CON VERT-COUNT*  0) 
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;SWI TCH  TO  CONTROL  VARIABLE  ELIMINATION 
;COUNT  OF  VARIABLES  ELIMINATED 
;COUNT  OF  FULL  BETA-CONVERSIONS 


(DEF INE 

META -COMB  I NAT  ION -LAMBDA 
(LAMBDA  (NODE) 

(LET  ((FM  ( NOOE\FORM  NODE))) 

(LET  ((ARGS  ( COMB INAT ION\ARGS  FM))) 

(DO  ((A  ( CDR  ARGS)  ( CDR  A))) 

((NULL  A)) 

(RPLACA  A (META-EVALUATE  (CAR  A))) 

( ALTER-NOOE  (CAR  A)  (SUBSTP  :*  NIL))) 

(LET  ((FN  ( NOOE\FORM  (CAR  ARGS)))) 

(DO  ((V  (LAMBDAWARS  FN)  (CDR  V)) 

(A  (CDR  ARGS)  (CDR  A)) 

(B  (META-EVALUATE  (LAMBDA\BOOY  FN)) 

(IF  (SUBST-CANDIDATE  (CAR  A)  (CAR  V)  B) 

(META-SUBSTITUTE  (CAR  A)  (CAR  V)  B) 

B))) 

((NULL  V) 

(ALTER-LAMBDA  FN  (BODY  :*  (META-EVALUATE  B))) 

(00  ((V  (LAMBDAWARS  FN)  (CDR  V)) 

(A  (CDR  ARGS)  (CDR  A))) 

((NULL  A)) 

(IF  (AND  ‘FLUSH- ARGS* 

(NULL  (GET  (CAR  V)  ’READ-REFS)) 

(NULL  (GET  (CAR  V)  ' VRI TE -REFS ) ) 

(OR  (EFFECTLESS-EXCEPT-CONS  (NODEXEFFS  (CAR  A))) 
(NOOEXSUBSTP  (CAR  A)))) 

(BLOCK  (IF  (OR  (MEMO  V (NODEXREFS  (LAMBDAXBODY  FN))) 

(MEMO  V (NODEXASETS  (LAMBDAXBODY  FN)))) 

(ERROR  ' | Reanalysis  lost  - META-COMB  INAT  I ON -LAMBDA  | 
NODE 

•FAIL-ACT)) 

(DELO  (CAR  A)  ARGS) 

(ERASE-ALL-NODES  (CAR  A)) 

(INCREMENT  «FLUSH-COUNT* ) 

(ALTER-LAMBOA  FN 

(VARS  :•  (DELO  (CAR  V)  (LAMBDAWARS  FN))) 
(UVARS  :•  (DELO  (GET  (CAR  V)  'USER-NAME) 

( L AMBDAXUVARS  FN))))))) 

(COND  ((NULL  ( LAMBDAWARS  FN)) 

(OR  (NULL  (CDR  ARGS)) 

(ERROR  ' |Too  many  args  In  M£TA-COMBINATION-LAMBDA| 

NOOE 

•FAIL-ACT)) 

(LET  ((BOD  (LAMBDAX600Y  FN))) 

(ERASE-NOOE  (CAR  ARGS)) 

(ERASE -NOOE  NOOE) 

(INCREMENT  *CONVERT-COUNT* ) 

BOD)) 

(T  ( REANALYZE  1 (CAR  ARGS)) 

(ALTER-NOOE  (CAR  ARGS)  (METAP  :•  T)) 

( REANALYZE 1 NOOE) 

(ALTER-NOOE  NOOE  (METAP  :*  T))))))))))) 
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( SUBST-CANDIDATE  ARG  VAR  BOD)  Is  a predicate  which  is  true  iff  it  is 
apparently  legal  to  attempt  to  substitute  the  argument  ARG  for  the  variable  VAR 
in  the  body  1301).  This  predicate  is  very  conservative,  because  there  is  no 
provision  for  backing  out  of  a bad  choice.  The  decision  is  made  on  this  basis: 
[1]  There  must  be  no  ASET  references  to  the  variable.  (This  is  overly 
restrictive,  but  is  complicated  to  check  for  correctly,  and  makes  little 
! difference  in  practice.) 

[Z]  One  of  three  conditions  must  hold: 

[Za]  There  is  at  most  one  reference  to  the  variable.  (Code  with  possible 
side  effects  must  not  be  duplicated.  Exceptions  occur,  for  example,  if  there 
are  two  references,  one  in  each  branch  of  an  IF,  so  that  only  one  can  be 
executed.  This  is  hard  to  detect,  and  relaxing  this  restriction  is  probably 
not  worthwhile. ) 

[Zb]  The  argument  is  a constant  or  variable.  (This  is  always  safe  because 
the  cost  of  a constant  or  variable  is  no  worse  than  the  cost  of  referencing 
the  variable  it  replaces.) 

[ Zc  ] The  argument  is  a LAMBDA-expression,  and  either: 

[Zcl]  There  is  no  more  than  one  reference.  (This  is  tested  again  because 
of  the  presence  of  debugging  switches  in  SUBST-CANDIDATE  which  can  control 
various  tests  independently  to  help  localize  bugs.) 

[ZcZ]  The  body  of  the  LAMBDA-expression  is  a combination,  all  of  whose 
descendants  are  constants  or  variables,  and  the  number  of  arguments  of  the 
combination  (not  counting  the  function)  does  not  exceed  the  number  of 
arguments  taken  by  the  LAMBDA-expression.  (The  idea  here  is  that 
substitution  of  the  LAMBDA-expression  into  function  position  of  some 
combination  will  later  allow  reduction  to  a combination  which  is  no  worse 
than  the  original  one.  This  test  is  a poor  heuristic  if  references  to  the 
variable  VAR  occur  in  other  than  function  position  within  BOD,  because  then 
several  closures  will  be  made  instead  of  one,  but  is  very  good  for  code 
typically  produced  by  the  expansion  of  macros.  In  retrospect,  perhaps  ENV- 
ANALYZE  should  maintain  a third  property  besides  READ-REFS  and  WRITE-REFS 
called,  say,  NON-FN-REFS.  This  would  be  the  subset  of  READ-REFS  which 
occur  in  other  than  function  position  of  a combination.  SUBST-CANDIDATE 
could  then  use  this  information.  Alternatively,  META-SUBSTITUTE  could,  as 
it  walked  the  node-tree  of  the  body,  keep  track  of  whether  a variable  was 
encountered  in  function  position,  and  refuse  to  substitute  a LAMBDA- 
expression  for  a variable  not  in  such  a position  which  had  more  than  one 
reference.  This  might  in  turn  prevent  other  optimizations,  however.) 
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(SET’  ‘SUBST  I TUTE*  T)  ;SVITCH  TO  CONTROL  SUBSTITUTION 

(SET'  *S  INGLE  -SUBST*  T)  ;SW!TCH  TO  CONTROL  SUBSTITUTION  OF  EXPRESSIONS  WITH  SIDE  EFFECTS 

(SET'  *1 AMBOA-SUBST*  T)  ;SVITCH  TO  CONTROL  SUBSTITUTION  OF  LAMBDA-EXPRESSIONS 

(DEFINE  SUBST -C AND  I DATE 

(LAMBDA  (ARG  VAR  BOD) 

(AND  ‘SUBSTITUTE* 

(NOT  (GET  VAR  ' WR I TE -REFS ) ) ; BE  PARANOID  FOR  NOW 

(OR  (AND  *S INGLE -SUBST* 

(NULL  (COR  (GET  VAR  'READ-REFS)))) 

(MEMQ  (TYPE  (NODENFORM  ARG))  '(CONSTANT  VARIABLE)) 

(AND  ‘LAMBDA -SUBST* 

(EO  (TYPE  ( NODENFORM  ARG))  ’LAMBOA) 

(OR  (NULL  (COR  (GET  VAR  'READ-REFS))) 

(LET  ((B  ( NODE \FORM  (LAMBDANBODY  (NODENFORM  ARG))))) 

(OR  (MEMO  (TYPE  B)  '(CONSTANT  VARIABLE)) 

(AND  (EQ  (TYPE  B)  'COMBINATION) 

(NOT  (>  (LENGTH  (COR  (COMB  I NAT  IONNARGS  B))) 

(LENGTH  (LAMBDAWARS  (NODENFORM  ARG))))) 
(DO  ((A  (COMB INAT IONNARGS  B)  (COR  A)) 

(P  T (AND  P (MEMO  (TYPE  (NODENFORM  (CAR  A))) 
'(CONSTANT  VARIABLE))))) 
((NULL  A)  P))>)))))))) 
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REANALYZE  1 calls  PASS1 -ANALYZE  on  the  given  node.  The  argument  T means 
that  optimization  is  in  effect,  and  so  EFFS-ANALYZE  must  be  invoked  after  ENV- 
ANALYZE  and  TRIV-ANALYZE  ( EFFS-ANALYZE  information  is  used  only  by  the 
optimizer).  The  argument  *REANALYZE*  specifies  whether  reanalysis  should  be 
forced  to  all  descendant  nodes,  or  whether  reanalysis  of  the  current  node  will 
suffice.  This  variable  normally  contains  the  symbol  ONCE,  meaning  reanalyze  only 
the  current  node.  META-EVALUATE  normally  ensures,  before  analyzing  a node,  that 
all  descendant  nodes  are  analyzed.  Thus  the  initial  pass-1  analysis  occurs 
incrementally,  interleaved  with  the  meta-evaluation  process. 

The  switch  ^REANALYZE*  may  be  set  to  the  symbol  ALL  to  force  all 
descendants  of  a node  to  be  reanalyzed  before  analyzing  the  node  itself.  This 
ability  is  provided  to  test  for  certain  bugs  in  the  optimizer.  If  the 
incremental  analysis  should  fail  for  some  reason,  then  the  descendant  nodes  may 
not  contain  correct  information  (for  example,  their  information  slots  may  be 
empty!).  The  ALL  setting  ensures  that  a consistent  analysis  is  obtained.  If  the 
optimizer's  behavior  differs  depending  on  whether  ^REANALYZE*  contains  ONCE  or 
ALL,  then  a problem  with  the  incremental  analysis  is  implicated.  This  switch  has 
been  very  useful  for  isolating  such  bugs. 


The  next  group  of  functions  are  utilities  for  META-SUBSTITUTE  which  deal 
with  sets  of  side-effects. 

EFFS- INTERSECT  takes  the  intersection  of  two  sets  of  side-effects.  It  is 
just  like  INTERSECT,  except  that  it  also  knows  about  the  two  special  sets  ANY  and 
NONE. 


EFFECTLESS  is  a predicate  which  is  true  of  an  empty  set  of  side-effects. 

EFFECTLESS-EXCEPT-CONS  is  a predicate  true  of  a set  of  side-effects  which 
is  empty  except  possibly  for  the  CONS  side-effect. 

PASSABLE  takes  a node  and  two  sets  of  side-effects,  which  should  be  the 
EFFS  and  AFFD  sets  from  some  other  node.  PASSABLE  is  a predicate  which  is  true 
if  the  given  node,  which  originally  preceded  the  second  in  the  standard 
evaluation  order,  can  legitimately  be  postponed  until  after  the  second  is 
evaluated.  That  is,  it  is  true  iff  the  first  node  can  "pass"  the  second  during 
the  substitution  process. 


(DEFINE  REANALYZE! 

(LAMBDA  (NODE) 

( PASS 1- ANALYZE  NODE  ‘REANALYZE*  T))) 
(SET'  ‘REANALYZE*  'ONCE) 
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HERE  WE  DETERMINE,  FOR  EACH  VARIABLE  NODE  WHOSE  VAR  IS  THE  ONE 
GIVEN.  WHETHER  IT  IS  POSSIBLE  TO  SUBSTITUTE  IN  FOR  IT;  THIS  IS 
DETERMINED  ON  THE  BASIS  OF  SIDE  EFFECTS.  THIS  IS  DONE  BY 
WALKING  THE  PROGRAM.  STOPPING  WHEN  A SIDE-EFFECT  BLOCKS  IT. 

A SUBSTITUTION  IS  MADE  IFF  IS  VARIABLE  NODE  IS  REACHED  IN  THE  WALK. 

THERE  IS  A BUG  IN  THIS  THEORY  TO  THE  EFFECT  THAT  A CATCH 
WHICH  RETURNS  MULTIPLY  CAN  CAUSE  AN  EXPRESSION  EXTERNAL 
TO  THE  CATCH  TO  BE  EVALUATED  TWICE.  THIS  IS  A DYNAMIC  PROBLEM 
WHICH  CANNOT  BE  RESOLVED  AT  COMPILE  TIME.  AND  SO  WE  SHALL 
IGNORE  IT  FOR  NOW. 


WE  ALSO  RESET  THE  METAP  FLAG  ON  ALL  NOOES  WHICH  HAVE  A 
SUBSTITUTION  AT  OR  BELOW  THEM.  SO  THAT  THE  META -EVALUATOR  WILL 
RE-PENETRATE  TO  SUBSTITUTION  POINTS,  WHICH  MAY  ADMIT  FURTHER 
OPTIMIZATIONS. 


(DEFINE  EFFS- INTERSECT 
(LAMBDA  (A  B) 

(COND  ((EQ  A 'ANY)  B) 

((EQ  B 'ANY)  A) 

((EQ  A 'NONE)  A) 

((EQ  B 'NONE)  B) 

(T  (INTERSECT  A B))))) 

(DEFINE  EFFECTLESS 

(LAMBDA  (X)  (OR  (NULL  X)  (EQ  X 'NONE)))) 

(DEFINE  EFFECTLESS-EXCEPT-CONS 

(LAMBDA  (X)  (OR  (EFFECTLESS  X)  (EQUAL  X '(CO NS))))) 

(DEFINE  PASSABLE 

(LAMBDA  (NODE  EFFS  AFFD) 

(BLOCK  (IF  (EMPTY  (NOOEVEFFS  NOOE)) 

(ERROR  • |Pass  1 Analysis  Missing  - PASSABLE! 

NODE 

•FAIL-ACT)) 

(AND  (EFFECTLESS  (EFFS-INTERSECT  EFFS  (NOOE\AFFD  NODE))) 

(EFFECTLESS  (EFFS-INTERSECT  AFFD  (NOOEVEFFS  NODE))) 
(EFFECTLESS-EXCEPT-CONS  (EFFS-INTERSECT  EFFS  (NOOENEFFS  NODE))))))) 


META-SUBSTITUTE  takes  a node-tree  ARG,  a variable  name  VAR,  and  another 
node-tree  BOO,  and  wherever  possible  substitutes  copies  of  ARG  for  occurrences  of 
VAR  within  BOD.  The  complexity  of  this  process  is  due  almost  entirely  to  the 
necessity  of  determining  the  extent  of  "wherever  possible". 

META-SUBSTITUTE  merely  spreads  out  the  EFFS  and  AFFD  slots  of  ARG  to  make 
them  easy  to  refer  to,  makes  an  error  check,  and  then  passes  the  buck  to  the 
internal  LABELS  routine  SUBSTITUTE,  which  does  the  real  work. 

SUBSTITUTE  recurs  over  the  structure  of  the  node-tree.  At  each  node  it 
first  checks  to  see  whether  VAR  is  in  the  REFS  set  of  that  node.  This  is  purely 
an  efficiency  hack:  if  VAR  is  not  in  the  set,  then  it  cannot  occur  anywhere 
below  that  node  in  the  tree,  and  so  SUBSTITUTE  can  save  itself  the  work  of  a 
complete  recursive  search  of  that  portion  of  the  node-tree. 

SUBSTITUTE  plays  another  efficiency  trick  in  cahoots  with  META-EVALUATE 
to  save  work.  Whenever  SUBSTITUTE  actually  replaces  an  occurrence  of  VAR  with  a 
copy  of  ARG,  the  copy  of  ARG  will  have  its  METAP  flag  turned  off  (set  to  NIL). 
Now  SUBSTITUTE  propagates  the  METAP  flag  back  up  the  node-tree;  when  all  sub- 
nodes of  a node  have  had  SUBSTITUTE  applied  to  them,  then  if  the  METAP  flag  of 
the  current  node  is  still  set,  it  is  set  to  the  AND  of  the  flags  of  the  subnodes. 
Thus  any  node  below  which  a substitution  has  occurred  will  have  its  METAP  flag 
reset.  More  to  the  point,  any  node  which  after  the  substitution  still  has  its 
METAP  flag  set  has  had  no  substitutions  occur  below  it.  META-EVALUATE  can  then 
be  applied  to  BOD  after  all  substitutions  have  been  tried  (this  occurs  in  META- 
COMBINATION-LAMBDA), and  META-EVALUATE  will  only  have  to  re-examine  those  parts 
of  BOD  which  have  changed.  In  particular,  if  no  substitutions  were  successful, 
META-EVALUATE  will  not  have  to  re-examine  BOD  at  all. 

If  the  variable  is  referenced  at  or  below  the  node,  it  breaks  down  into 
cases  according  to  the  type  of  the  node. 

For  a CONSTANT,  no  action  is  necessary. 

For  a VARIABLE,  no  action  is  taken  unless  the  variable  matches  VAR,  in 
which  case  the  node  is  erased  and  a copy  of  ARG  is  made  and  returned  in  its 
place.  The  SUBSTP  slot  of  the  original  ARG  is  set  as  a flag  to  META-COMBINATION- 
LAMBDA  (q.v.),  to  let  it  know  that  at  least  one  substitution  succeeded. 

For  a LAMBDA,  substitution  can  occur  in  the  body  only  if  ARG  has  no  side- 
effects  except  possibly  CONS.  This  is  because  evaluation  of  the  LAMBDA- 
expression  (to  produce  a closure)  will  not  necessarily  cause  evaluation  of  the 
side-effect  in  ARG  at  the  correct  time.  The  special  case  of  a LAMBDA  occurring 
as  the  function  in  a COMBINATION  is  handled  separately  below. 

For  an  IF,  substitution  is  attempted  in  the  predicate.  It  is  attempted 
in  the  other  two  sub-trees  only  if  ARG  can  pass  the  predicate. 

For  an  ASET'  or  a CATCH,  substitution  is  attempted  in  the  body.  The  same 
is  true  of  LABELS,  but  substitution  is  also  attempted  in  the  labelled  function 
def initions . 


001 
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002 

(SET1  «SUBST -COUNT*  0) 

;COUNT  OF  SUBSTITUTIONS 

003 

(SET'  *L AMBOA- BODY-SUBST*  T) 

;SWITCH  TO  CONTROL  SUBSTITUTION  IN  LAMBDA  BODIES 

004 

(SET1  ‘LAMBDA- BOOY- SUBS T- TRY-COUNT*  0)  ;COUNT  THEREOF  - TRIES 

005 

(SET*  *L AMBDA- BODY- SUBST -SUCCESS 

-COUNT*  0)  ;COUNT  THEREOF  - SUCCESSES 

006 

007 

008 

(OEF INE 

009 

META-SUBSTITUTE 

010 

(LAMBDA 

Oil 

(ARG  VAR  BOD) 

012 

(LET  ((EFFS  ( NODE\EFFS  ARG)) 

013 

(AFFO  (NODEXAFFO  ARG))) 

014 

( IF  (EMPTY  EFFS) 

015 

(ERROR  '|Pass  1 Analysis  Screwed  Up  - META-SUBSTITUTE | ARG  'FAIL-ACT)) 

016 

(LABELS 

017 

((SUBSTITUTE 

018 

(LAMBOA  (NODE) 

019 

( IF  (OR  (EMPTY 

(NODEXREFS  NODE)) 

020 

(NOT  (MEMO  VAR  (NODEXREFS  NOOE ) ) ) ) ;EFF ICIENCY  HACK 

021 

NODE 

022 

(LET  ((FM 

(NODEXFORM  NOOE))) 

023 

(EQCASE  (TYPE  FM) 

024 

(CONSTANT  NOOE) 

025 

(VARIABLE 

026 

(IF  (EQ  ( VARIABLEXVAR  FM)  VAR) 

027 

(BLOCK  (ERASE-ALL-NOOES  NOOE) 

028 

(INCREMENT  «SUBST-COUNT« ) 

029 

(ALTER-NOOE  ARG  (SUBSTP  :*  T)) 

030 

(COPY-COOE  ARG)) 

031 

NODE ) ) 

032 

(LAMBDA 

033 

(IF  (AND  (EFFECTLESS-EKCEPT-CONS  EFFS)  (EFFECTLESS  AFFO)) 

034 

(ALTER-LAMBDA  FM  ( BOOY  :«  (SUBSTITUTE  (LAMBDAXBOOY  FM))))) 

035 

(IF  ( NOOE XME TAP  NOOE) 

036 

(ALTER-NOOE  NOOE  (METAP  (NODEXMETAP  ( LAMBDAXBOOY  FM))))) 

037 

NOOE) 

0 38 

(IF 

039 

( ALTER- IF  FM  (PREO  :*  (SUBSTITUTE  (IFXPREO  FM)))) 

040 

(IF  (PASSABLE  (IFXPREO  FM)  EFFS  AFFO) 

041 

(ALTER- IF  FM 

042 

(CON  :*  (SUBSTITUTE  (IFXCON  FM))) 

043 

(ALT  :*  (SUBSTITUTE  ( IF\ALT  FM))))) 

044 

(IF  (NODEXMETAP  NODE) 

045 

(ALTER-NOOE  NODE 

046 

(METAP  :•  (AND  (NODEXMETAP  (IFXPREO  FM)) 

047 

(NOOEXMETAP  (IFXCON  FM)) 

048 

(NOOEXMETAP  ( I EXALT  FM)))))) 

049 

NOOE) 

050 

(ASET 

051 

(ALTER-ASET  FM  (BOOY  :*  (SUBSTITUTE  ( ASETXBODY  FM)))) 

052 

(IF  (NOOEXMETAP  NODE) 

053 

(ALTER-NOOE  NOOE  (METAP  (NOOEXMETAP  (ASETXBODY  FM))))) 

0 54 

NOOE) 

055 

(CATCH 

056 

(ALTER-CATCH  FM  (BOOY  :•  (SUBSTITUTE  (CATCHXBOOY  FM)))) 

057 

(IF  (NOOEXMETAP  NOOE) 

0 58 

(ALTER-NOOE  NOOE  (METAP  :«  (NOOEXMETAP  (CATCHXBOOY  FM))))) 

059 

NOOE) 

060 

(LABELS 

061 

(ALTER-LABELS  FM  (BOOY  :*  (SUBSTITUTE  (LABELSXBOOY  FM)))) 

062 

(00  ((0  (LABELSXFNOEFS  FM)  (COR  0)) 

063 

(MP  (NOOEXMETAP  (LABELSXBOOY  FM)) 

064 

(AND  MP  (NOOEXMETAP  (CAR  0))))) 

065 

((NULL  D) 

068 

(IF  (NODEXMETAP  NODE) 

067 

(ALTER-NOOE  NOOE  (METAP  MP)))) 

068 

(RPLACA  0 (SUBSTITUTE  (CAR  0)))) 

069 

NOOE) 
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The  most  complicated  case  is  the  COMBINATION.  First  it  is  determined  (in 
the  variable  X)  whether  ARG  can  correctly  pass  all  of  the  arguments  of  the 
combination.  (It  is  not  possible  to  substitute  into  any  argument  unless  all  can 
be  passed,  because  at  this  time  it  has  not  been  decided  in  what  order  to  evaluate 
them.  This  decision  is  the  free  choice  of  CONVERT-COMBINATION  below.)  If  it 
can,  then  substitution  is  attempted  in  all  of  the  arguments  except  the  function 
itself.  Then  two  kinds  of  function  are  distinguished.  If  it  is  not  a LAMBDA,  a 
straightforward  recursive  call  to  SUBSTITUTE  is  used.  If  it  is,  then 
substitution  is  attempted  in  the  body  of  the  LAMBDA  (not  in  the  LAMBDA  itself; 
substitution  in  a LAMBDA  requires  that  ARG  be  EFFECTLESS-EXCEPT-CONS,  but  in  this 
special  case  we  know  that  the  LAMBDA-expression  will  be  invoked  immediately,  and 
so  it  is  all  right  if  ARG  has  side-effects). 
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(COMBINATION 

(LET  ((ARGS  (COMB  I NAT IONXARGS  FM))) 

(00  ((A  ARGS  (COR  A)) 

(X  T (AND  X (PASSABLE  (CAR  A)  EFFS  AFFD)))) 

((NULL  A) 

(IF  X (DO  ((A  (COR  ARGS)  (COR  A))) 

((NULL  A)) 

(RPLACA  A (SUBSTITUTE  (CAR  A))))) 

(IF  (AND  ‘LAMBDA -BODY- SUBST* 

(EQ  (TYPE  (NOOEXFORM  (CAR  ARGS)))  ’LAMBDA)) 
(LET  ((FN  ( NOOE \FORM  (CAR  ARGS)))) 

( INCREMENT  ‘LAMBDA-BOOY-SUBST-TRY-COUNT*  ) 
(CONO  (X 

( INCREMENT 

‘LAMBOA-BOOY-SUBST- SUCCESS -COUNT*  ) 
(ALTER-LAMBOA 
FN 

(BODY  :•  (SUBSTITUTE 

( LAMBDA\BODY  FN)))))) 

(IF  ( NOOE \ME TAP  (CAR  ARCS)) 

(ALTER-NOOE 
(CAR  ARGS) 

(METAP  :*  ( NODEXME  TAP 

(LAMBOAXBODY  FN)))))) 

(IF  X (RPLACA  ARGS  (SUBSTITUTE  (CAR  ARGS))))))) 
(DO  ((A  ARGS  (COR  A)) 

(MP  T (AND  HP  (NODEXMETAP  (CAR  A))))) 

((NULL  A) 

(IF  (NODEXMETAP  NOOE) 

(ALTER-NOOE  NODE  (METAP  :*  MP)))))) 


(SUBSTITUTE  BOO))))) 


NODE))))))) 
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COPY-CODE  is  used  by  META-SUBSTITUTE  to  make  copies  of  node-trees 
representing  code.  It  invokes  COPY-NODES  with  appropriate  additional  arguments. 

COPY-NODES  does  the  real  work.  The  argument  ENV  is  analogous  to  the 
argument  ENV  taken  by  ALPHATIZE.  However,  variables  are  not  looked-up  in  ENV  by 
COPY-NODES;  ENV  is  maintained  only  to  install  in  the  new  nodes  for  debugging 
purposes.  The  argument  RNL  is  a "rename  list"  for  variables.  When  a node  is 
copied  which  binds  variables,  new  variables  are  created  for  the  copy.  RNL 
provides  a mapping  from  generated  names  in  the  original  code  to  generated  names 
in  the  copy  (as  opposed  to  ENV,  which  maps  user  names  to  generated  names  in  the 
copy).  Thus,  when  a LAMBDA  node  is  copied,  new  names  are  generated,  and  PAIRLIS 
is  used  to  pair  new  names  with  the  LAMBDA\VARS  of  the  old  node,  adding  the  new 
pairs  to  RNL. 

A neat  trick  to  aid  debugging  is  that  the  new  names  are  generated  by 
using  the  old  names  as  the  arguments  to  GENTEMP.  In  this  way  the  name  of  a 
generated  variable  contains  a history  of  how  it  was  created.  For  example,  VAR- 
34-73-  156  was  created  by  copying  the  LAMBDA  node  which  bound  VAR-34-73,  which  in 
turn  was  copied  from  the  node  which  bound  VAR-34.  Copies  of  CATCH  and  LABELS 
variables  are  generated  in  the  same  way. 

The  large  EQCASE  handles  the  different  types  of  nodes.  The  result  is 
then  given  to  NODIFY,  the  same  routine  which  creates  nodes  for  ALPHATIZE.  Recall 
that  NODIFY  initializes  the  METAP  slot  to  NIL;  the  next  meta-evaluation  which 
comes  along  will  cause  pass-1  analysis  to  be  performed  on  the  new  copies. 

Note  particularly  that  the  WARS  list  of  a LAMBDA  node  is  copied,  for  the 
same  reason  that  ALPHA-LAMBDA  makes  a copy:  META-COMBINATION-LAMBDA  may  alter  it 
destructively. 
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002 

(DEF  INE 

COPY-CODE 

003 

(LAMBDA  (NODE) 

004 

( REANALYZE  1 (COPY-NODES  NOOE  (NOOEXENV  NOOE ) NIL)))) 

005 

006 

(DEF  INE 

007 

COPY-NODES 

000 

(LAMBDA  (NODE  ENV  RNL ) 

009 

(NODIFY 

010 

(LET  ((FM  (NODEXFORM  NODE))) 

on 

(EQCASE 

(TYPE  FM) 

012 

(CONSTANT 

013 

(CONS-CONSTANT  (VALUE  • (CONSTANTXVALUE  FM)))) 

014 

(VARIABLE 

015 

(CONS -VAR I ABLE  (VAR  « (LET  ((SLOT  (ASSQ  (VARIABLEXVAR  FM)  RNL))) 

016 

(IF  SLOT  (CADR  SLOT)  (VARIABLEXVAR  FM)))) 

017 

(GLOBALP  « ( VAR I ABLEXGLOBALP  FM)))) 

018 

(LAMBDA 

019 

(LET  ((VARS  ( AMAPCAR  GENTEMP  (LAMBDAXVARS  FM)))) 

020 

(CONS-LAMBDA  (UVARS  * (APPEND  ( LAMBOAXUVARS  FM)  NIL)) 

021 

(VARS  * VARS) 

022 

(BODY  * (COPY- NODES 

023 

(LAMBDAXBODY  FM) 

024 

(PAIRLIS  (LAMBOAXUVARS  FM)  VARS  ENV) 

025 

(PAIRLIS  (LAMBDAXVARS  FM)  VARS  RNL)))))) 

026 

(IF  (CONS- IF  (PRED  - (COPY-NODES  ( IFXPRED  FM)  ENV  RNL)) 

027 

(CON  • (COPY-NODES  (IFXCON  FM)  ENV  RNL)) 

020 

(ALT  • (COPY-NODES  ( IFXALT  FM)  ENV  RNL)))) 

029 

(ASET 

030 

(CONS-ASET  (VAR  • (LET  ((SLOT  (ASSO  (ASETXVAR  FM)  RNL))) 

031 

(IF  SLOT  (CADR  SLOT)  (ASETXVAR  FM)))) 

032 

(GLOBALP  * ( ASETXGLOBALP  FM)) 

033 

(BOOY  * (COPY-NOOES  ( ASETXBOOY  FM)  ENV  RNL)))) 

034 

(CATCH 

035 

(LET  ((VAR  (GENTEMP  (CATCHXVAR  FM))) 

0 36 

(UVAR  (CATCHXUVAR  FM))) 

037 

(CONS-CATCH  (UVAR  « (CATCHXUVAR  FM)) 

0 38 

(VAR  • VAR) 

039 

(BOOY  « (COPY-NOOES 

040 

(CATCHXBOOY  FM) 

041 

(CONS  (LIST  UVAR  VAR)  ENV) 

042 

(CONS  (LIST  (CATCHXVAR  FM)  VAR)  RNL)))))) 

043 

(: ABELS 

044 

(LET  ( ( F NVARS  (AMAPCAR  GENTEMP  (LABELSXF NVARS  FM)))) 

045 

(LET  ( (LENV  (PAIRLIS  (LABELSXUF NVARS  FM)  F NVARS  ENV)) 

046 

(LRNL  (PAIRLIS  (LABELSXF NVARS  FM)  F NVARS  RNL))) 

047 

(CONS-LABELS  ( UF NVARS  • (LABELSXUF NVARS  FM)) 

040 

(F NVARS  * F NVARS) 

049 

(FNDEFS  • (AMAPCAR 

050 

(LAMBDA  (N)  (COPY-NOOES  N LENV  LRNL)) 

051 

( LABELSXF NOEFS  FM))) 

052 

(BOOY  - (COPY-NOOES  (LABELSXBOOY  FM) 

053 

LENV 

054 

LR8L )))))) 

055 

(COMBINATION 

056 

(CONS -COMBINATION  (ARGS  • (AMAPCAR  (LAMBDA  (N)  (COPY-NOOES  N ENV  RNL)) 

057 

(COMB INAT ION XARGS  FM))) 

050 

( WARNP  • ( COMB INAT IONXWARNP  FM)))))) 

059 

( NOOEXSEXPR 

NOOE) 

060 

EMV))) 
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The  next  several  functions  process  the  node-tree  produced,  analyzed,  and 
optimized  by  pass  1,  converting  it  to  another  representation.  This  new 
representation  is  a tree  structure  very  similar  to  the  node-tree,  but  has 
different  components  for  the  pass-2  analysis.  We  will  call  this  the  "cnode- 
tree".  The  "c"  stands  for  "Continuation-passing  style":  for  the  conversion 
process  transforms  the  node-tree  into  a form  which  uses  continuation-passing  to 
represent  the  control  and  data  flow  within  the  program. 

We  define  a new  collection  of  data  types  used  to  construct  cnode-trees. 
The  CNOPF.  data  type  is  analogous  to  the  NODE  data  type;  one  component  CFORM 
contains  a variant  structure  which  is  specific  to  the  programming  construct 
represented  by  the  CNODE. 

The  types  CVARIABLE,  CLAMBDA,  CIF,  CA5ET,  CLABELS,  and  COMBINATION 
correspond  roughly  to  their  non-C  counterparts  in  pass  1. 

Type  TRIVIAL  is  used  to  represent  pieces  of  code  which  were  designated 
trivial  in  pass  1 (TRIVP  slot  = T)  by  TRIV-ANALYZE;  the  NODE  component  is  simply 
the  pass-1  node-tree  for  the  trivial  code.  This  is  the  only  case  in  which  part 
of  the  pass-1  node-tree  survives  the  conversion  process  to  be  used  in  pass  2. 

A CONTINUATION  is  just  like  a CLAMBDA  except  that  it  has  only  one  bound 
variable  VAR.  This  variable  can  never  appear  in  a CASET,  and  so  the  CONTINUATION 
type  has  no  ASETVARS  slot;  all  other  slots  are  similar  to  those  in  a CLAMBDA 
structure . 

A RETURN  structure  is  just  like  a CCOMBINATION,  except  that  whereas  a 
CCOMBINATION  may  invoke  a CLAMBDA  which  may  take  any  number  of  arguments,  a 
RETURN  may  invoke  only  a CONTINUATION  on  a single  value.  Thus,  in  place  of  the 
ARGS  slot  of  a CCOMBINATION,  which  is  a list  of  cnodes,  a RETURN  has  two  slots 
CONT  and  VAL,  each  of  which  is  a cnode. 

(In  retrospect,  this  was  somewhat  of  a design  error.  The  motivation  was 
that  the  world  of  closures  could  be  dichotomized  into  LAMBDA-closures  and 
cont.inviation-closures,  as  a result  of  the  fundamental  semantics  of  the  language: 
one  world  is  used  to  pass  values  "down"  into  functions,  and  the  other  to  pass 
values  "up"  from  functions.  Combinations  can  similarly  be  dichotimized,  and  I 
thought  it  would  be  useful  to  reflect  this  distinction  in  the  data  types  to 
enforce  and  error-check  this  dichotomy.  However,  as  it  turned  out,  there  is  a 
great  deal  of  code  in  pass  2 which  had  to  be  written  twice,  once  for  each 
"world",  because  the  data  types  involved  were  different.  It  would  be  better  to 
have  a single  structure  for  both  CLAMBDA  and  CONTINUATION,  with  an  additional 
slot  flagging  which  kind  it  was.  Then  most  code  in  pass  2 could  operate  on  this 
structure  without  regard  for  which  "world"  it  belonged  to,  and  code  which  cared 
could  check  the  flag.) 


CONVERSION  TO  CONT  INUAT  ION -PASS  ING  STYLE 
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;;;  THIS  INVOLVES  MAKING  A COMPLETE  COPY  OP  THE  PROGRAM  IN  TERMS 
;;;  OF  THE  FOLLOWING  NEW  DATA  STRUCTURES: 

( OEF  TYPE  CNOOE  (ENV  REFS  CLOVARS  CFORM) ) 

; ENV  ENVIRONMENT  (A  LIST  OF  VARIABLES.  NOT  A MAPPING;  DEBUGGING  ONLY) 

; RE F S VARIABLES  BOUNO  ABOVE  AND  REFERENCED  BELOW  THIS  CNODE 
;CLOVARS  VARIABLES  REFERRED  TO  AT  OR  BELOW  THIS  CNODE  BY  CLOSURES 
; (SHOULD  BE  A SUBSET  OF  REFS) 

;CFORM  ONE  Of  THE  BELOW  TYPES 
(OEF  TYPE  TRIVIAL  (NODE)) 

; NODE  A PASS-1  NODE  TREE 
(DEFTYPE  CVAR I ABLE  (VAR)) 

; VAR  GENERATED  VARIABLE  NAME 

(DEFTYPE  CLAMBDA  (VARS  BODY  FNP  TVARS  NAME  DEP  MAXDEP  CONSENV  CLOSEREFS  ASETVARS ) ) 
;FNP  NON-NIL  ■=>  NEEDN'T  MAKE  A FULL  CLOSURE  OF  THIS 
; CLAMBDA.  MAY  BE  ' NOCLOSE  OR  'EZCLOSE  (THE  FORMER 

; MEANING  NO  CLOSURE  IS  NECESSARY  AT  ALL.  THE  LATTER 

; THAT  THE  CLOSURE  IS  MERELY  THE  ENVIRONMENT). 

; TVARS  THE  VARIABLES  WHICH  ARE  PASSED  THROUGH  TEMP  LOCATIONS 

; ON  ENTRY.  NON-NIL  ONLY  IF  FNP= ' NOCLOSE ; THEN  IS 

; NORMALLY  THE  LAMBDA  VARS.  BUT  MAY  BE  DECREASED 

; TO  ACCOUNT  FOR  ARGS  WHICH  ARE  THEMSELVES  KNOWN  NOCLOSE'S, 

; OR  WHOSE  CORRESPONDING  PARAMETERS  ARE  NEVER  REFERENCED. 

; THE  TEMP  VARS  INVOLVED  START  IN  NUMBER  AT  DEP. 

; NAME  THE  PROG  TAG  USED  TO  LABEL  THE  FINAL  OUTPUT  CODE  FOR  THE  CLAMBDA 
;0EP  DEPTH  OF  TEMPORARY  REGISTER  USAGE  WHEN  THE  CLAMBDA  IS  INVOKED 
; MAXDEP  MAXIMUM  DEPTH  OF  REGISTER  USAGE  WITHIN  CLAMBDA  BODY 
;CONSENV  THE  "CONSED  ENVIRONMENT"  WHEN  THE  CLAMBDA  IS  EVALUATED 
;CLOSEREFS  VARIABLES  REFERENCED  BY  THE  CLAMBDA  WHICH  ARE  NOT  IN 
; THE  CONSED  ENVIRONMENT  AT  EVALUATION  TIME.  AND  SO  MUST  BE 

; ADDED  TO  CONSENV  AT  THAT  POINT  TO  MAKE  THE  CLOSURE 

; ASETVARS  THE  ELEMENTS  Of  VARS  WHICH  ARE  EVER  SEEN  IN  A CASET 
(DEFTYPE  CONTINUATION  (VAR  BODY  FNP  TVARS  NAME  DEP  MAXDEP  CONSENV  CLOSEREFS)) 
;COMPONENTS  ARE  AS  FOR  CLAMBDA 
(OEF  TYPE  C IF  (PRED  CON  ALT)) 

(DEFTYPE  CASET  (CONT  VAR  BODY)) 

(DEFTYPE  CLABELS  ( F NVARS  FNDEFS  FNENV  EASY  CONSENV  BODY)) 

;FNENV  A LIST  OF  VARIABLES  TO  CONS  ONTO  THE  ENVIRONMENT  BEFORE 
; CREATING  THE  CLOSURES  AND  EXECUTING  THE  BODY 

;EASY  NON-NIL  IFF  NO  LABELED  FUNCTION  IS  REFERRED  TO 

; AS  A VARIABLE.  CAN  BE  'NOCLOSE  OR  'EZCLOSE 

; (REFLECTING  THE  STATUS  OF  ALL  THE  LABELLED  FUNCTIONS) 

; CONSENV  AS  FOR  CLAMBDA 
(DEFTYPE  CCOMB INAT ION  (ARGS)) 

; ARGS  LIST  OF  CNODES  REPRESENTING  ARGUMENTS 
(DEFTYPE  RETURN  (CONT  VAL)) 

;CONT  CNODE  FOR  CONTINUATION 

;VAL  CNOOE  FOR  VALUE 
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CNODIFY  is  for  cnode-trees  what  NODIFY  was  for  node-trees.  It  takes  a 
CFORN  and  wraps  a CtfODE  structure  around  it. 


CONVF.RT  is  the  main  function  of  the  conversion  process;  it  is  invoked  by 
COMPILE  on  the  result  (META-VERSION)  of  pass  1.  CONVERT  dispatches  on  the  type  of 
node  to  be  converted,  often  calling  some  specialist  which  may  call  it  back 
recursively  to  convert  subnodes.  CONT  may  be  a cnode,  or  NIL.  If  it  is  a cnode, 
then  that  cnode  is  the  code  for  a continuation  which  is  to  receive  as  value  that 
produced  by  the  code  to  be  converted.  That  is,  when  CONVERT  finishes  producing 
code  for  the  given  node  (the  first  argument  to  CONVERT),  then  in  effect  a RETURN 
is  created  which  causes  the  value  of  the  generated  code  to  be  returned  to  the 
code  represented  by  CONT  (the  second  argument  to  CONVERT).  Sometimes  this  RETURN 
cnode  is  created  explicitly  (as  for  CONSTANT  and  VARIABLE  nodes),  and  sometimes 
only  implicitly,  by  passing  CONT  down  to  a specialist  converter. 

MP  is  T if  optimization  was  performed  by  pass  1,  and  NIL  otherwise.  This 
argument  is  for  debugging  purposes  only:  CONVERT  compares  this  to  the  METAP  slot 
of  the  pass-1  nodes  in  order  to  detect  any  failures  of  the  incremental 
optimization  and  analysis  process.  CONVERT  also  makes  some  other  consistency 
checks . 
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(DEFINE  CNOOIFY 

(LAMBDA  (CFORM) 

( CONS -CNODE  (CFORM  = CFORM)))) 
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(DEFINE  CONVERT 

(LAMBDA  (NODE  CONT  MP ) 

(LET  ((FM  ( NODE\FORM  NODE))) 

(IF  (EMPTY  (NODENTR1VP  NODE)) 

(ERROR  ' |Pass  1 analysis  missingl  NOOE  ’FAIL-ACT)) 

(OR  (EQ  ( NODE \ME TAP  NOOE)  MP) 

(ERROR  ' |Meta-evaluation  Screwed  Up  ME  TAP | NODE  ’FAIL-ACT)) 

(EQCASE  (TYPE  FM) 

(CONSTANT 

(OR  (NODE\TRIVP  NOOE) 

(ERROR  ' |Non-triviel  Constantl  NODE  ’FAIL-ACT)) 

(MAKE -RE TURN  (CONS-TRIVIAL  (NODE  ■=  NODE))  CONT)) 

(VARIABLE 

(OR  ( NODEVTR I VP.  NODE) 

(ERROR  ' INon-trivial  Variable!  ’FAIL-ACT)) 

(MAKE-RETURN  (CONS-TRIVIAL  (NOOE  = NODE))  CONT)) 

(LAMBDA  (MAKE-RETURN  (CONVERT-LAMBDA-FM  NODE  NIL  MP)  CONT)) 

(IF  (OR  CONT  (ERROR  ' |Null  Continuation  to  IF  | NODE  ’FAIL-ACT)) 

(CONVERT- IF  NODE  FM  CONT  MP)) 

(ASET  (OR  CONT  (ERROR  ’|Nu11  Continuation  to  ASET  | NODE  ’FAIL-ACT)) 
(CONVERT-ASET  NODE  FM  CONT  MP)) 

(CATCH  (OR  CONT  (ERROR  ’ | Nul 1 Continuation  to  CATCH|  NODE  ’FAIL-ACT)) 
(CONVERT-CATCH  NODE  FM  CONT  MP)) 

(LABELS  (OR  CONT  (ERROR  ’|Null  Continuation  to  LABELSI  NODE  ’FAIL-ACT)) 
(CONVERT-LABELS  NODE  FM  CONT  MP)) 

(COMBINATION  (OR  CONT  (ERROR  '|Null  Continuation  to  Combination! 

NODE 

•FAIL-ACT)) 

(CONVERT -COMB  I NAT  ION  NODE  FM  CONT  MP)))))) 
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MAKE-RETURN  takes  a CFORM  (one  of  the  types  TRIVIAL,  CVARIABLE,  ...)  and 
a continuation,  and  constructs  an  appropriate  returning  of  the  value  of  the  CFORM 
to  the  continuation.  First  the  CFORM  is  given  to  CNODIFY.  If  the  continuation 
is  in  fact  NIL  (meaning  none),  this  new  cnode  is  returned;  otherwise  a RETURN 
cnodo  is  constructed. 


CONVERT-LANBDA-FN  takes  a LAMBDA  node  and  converts  it  into  a CLAMBDA 
cnode.  The  two  are  isomorphic,  except  that  an  extra  variable  is  introduced  as  an 
extra  first  parameter  to  the  CLAMBDA.  Conceptually,  this  variable  will  be  bound 
to  a continuation  when  the  CLAMBDA  is  invoked  at  run  time;  this  continuation  is 
the  one  intended  to  receive  the  value  of  the  body  of  the  CLAMBDA.  This  is 
accomplished  by  creating  a new  variable  name  CONT-nnn,  which  is  added  into  the 
lambda  variables.  A new  CVARIABLE  node  is  made  from  it,  and  given  to  CONVERT  as 
the  continuation  when  the  body  of  the  LAMBDA  node  is  to  be  recursively  converted. 

The  CNAME  argument  is  used  for  a special  optimization  trick  by  CONVERT- 
COMBINATION,  described  below. 


CONVERT- IF  distinguishes  several  cases,  to  simplify  the  converted  code. 
First,  if  the  entire  IF  node  is  trivial,  then  a simple  CTRIVIAL  node  may  be 
created  for  it.  Otherwise,  the  general  strategy  is  to  generate  code  which  will 
bind  the  given  continuation  to  a variable  and  evaluate  the  predicate.  This 
predicate  receives  a continuation  which  will  examine  the  resulting  value  (with  a 
C IF ) , and  then  perform  either  the  consequent  or  alternative,  which  are  converted 
using  the  bound  variable  as  the  continuation.  The  reason  that  the  original 
continuation  is  bound  to  a variable  is  because  it  would  be  duplicated  by  using  it 
for  two  separate  calls  to  CONVERT,  thereby  causing  duplicate  code  to  be  generated 
for  it.  A schematic  picture  of  the  general  strategy  is: 

NODE  = (IF  a b c)  and  CONT  = k becomes 

((CLAMBDA  (q) 

(RETURN  (CONTINUATION  (p) 

(CIF  p 

(RETURN  q b) 

(RETURN  q c))) 


Now  there  are  two  special  cases  which  allow  simplification.  First,  if  the  given 
continuation  is  already  a cvariable,  there  is  no  point  in  creating  a new  one  to 
bind  it  to.  This  eliminates  the  outer  CCOMBINATION  and  CLAMBDA.  Second,  if  the 
predicate  a is  trivial  (but  the  whole  IF  is  not,  because  the  consequent  b or  the 
alternative  c is  non-trivial ) , then  the  CONTINUATION  which  binds  p is 
unnecessary. 
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002 

(OEF  INE 

MAKE-RETURN 

003 

(LAMBDA  ( CFORM  CONT) 

004 

(LET  ((CN  (CNOOIFY  CFORM))) 

005 

(IF  CONT 

006 

(CNOOIFY  (CONS-RETURN  (CONT  * CONT)  (VAL  * CN))) 

007 

CN)))) 

008 

009 

(DEE  INE 

CONVERT-LAMBDA-FM 

010 

(LAMBDA  (NODE  CNAME  MP) 

Oil 

(LET  ((CV  (GENTEMP  'CONT)) 

012 

(FM  ( NOOE\FORM  NODE))) 

013 

(CONS-CLAMBOX  (VARS  * (CONS  CV  ( LAMBDA\VARS  FM))) 

014 

(BODY  « (CONVERT  (LAMBDAXBOOY  FM) 

015 

(CNOOIFY 

016 

(CONS-CVARIABLE  (VAR  • (OR  CNAME  CV)))) 

017 

MP)))))) 

018 

019 

;;;  ISSUES  FOR  CONVERTING  IF. 

020 

;;;  <l) 

IF  WHOLE  IF  IS  TRIVIAL.  MAY  JUST  CREATE  A CTRIVIAL. 

021 

;;;  (2) 

IF  CONTINUATION  IS  NON-CVARIABLE . MUST  BIND  A VARIABLE  TO  IT. 

022 

(3) 

IF  PREDICATE  IS  TRIVIAL.  MAY  JUST  STICK  IT  IN  SIMPLE  CIF. 

023 

024 

(OEF INE 

CONVERT- IF 

025 

(LAMBDA  (NODE  FM  CONT  MP) 

026 

(IF  ( NODEXTRIVP  NODE) 

027 

(MAKE-RETURN  (CONS-TRIVIAL  (NODE  = NODE))  CONT) 

028 

(LET  ( ( CVAR  (IF  (EQ  (TYPE  (CNODEXCFORM  CONT))  'CVARIABLE ) 

029 

NIL 

030 

(GENTEMP  'CONT))) 

031 

( PVAR  (IF  ( NODEXTRIVP  (IF\PRE0  FM)) 

032 

NIL 

033 

(NODEXNAME  (IFXPREO  FM))))) 

034 

(LET  ((ICONT  (IF  CVAR 

035 

(CNOOIFY  (CONS-CVARIABLE  (VAR  - CVAR))) 

0 36 

CONT)) 

037 

( IPRED  ( IF  PVAR 

0 38 

(CNOOIFY  (CONS-CVARIABLE  (VAR  • PVAR))) 

039 

(CNOOIFY  (CONS-TRIVIAL  (NODE  « (IFXPREO  FM))))))) 

040 

(LET  ((CIF  (CNOOIFY 

041 

(CONS -CIF 

042 

(PREO  - IPRED) 

043 

(CON  . (CONVERT  (IFXCON  FM)  ICONT  HP)) 

044 

(ALT  • (CONVERT  ( IFXALT  FM) 

045 

(CNOOIFY 

046 

(CONS-CVARIABLE 

04  7 

(VAR  • (CVARIABLEXVAR 

048 

(CNOOEXCFORM  ICONT))))) 

049 

MP)))))) 

050 

(LET  ((FOO  (IF  PVAR 

051 

(CONVERT  (IFXPREO  FM) 

052 

(CNOOIFY  (LONS -CONTINUATION  (VAR 

053 

(BODY 

054 

MP) 

055 

CIF))) 

056 

(IF  CVAR 

057 

(CNOOIFY 

058 

( CONS -CCOMB I NAT  ION 

059 

(AR6S  • (LIST  (CNOOIFY 

060 

(CONS-CLAMBDA 

061 

(VARS  • (LIST  CVAR)) 

062 

(BOOY  * FOO))) 

063 

CONT)))) 

064 

POO)))))))) 

PVAR) 

• CIF ))) 


180 


This  is  all  done  as  follows.  First  CVAR  and  PVAR  are  bound  to  generated 
names  if  necessary,  CVAR  for  binding  the  continuation  and  PVAR  for  binding  the 
predicate  value.  Then  ICONT  and  IPRED  (the  "I"  is  a mnemonic  for  "internal")  are 
bound  to  the  cnodes  to  be  used  for  the  two  conversions  of  consequent  and 
alternative,  and  for  the  predicate  of  the  CIF,  respectively.  CIF  is  then  bound 
to  the  cnode  for  the  CIF  code,  including  the  conversions  of  consequent  and 
alternative.  Finally,  using  FOO  as  an  intermediary,  CONVERT-IF  first 
conditionally  arranges  for  conversion  of  a non-trivial  predicate,  and  then 
conditionally  arranges  for  the  binding  of  a non-cvariable  continuation.  The 
result  of  all  this  is  returned  as  the  final  conversion  of  the  original  IF  node. 


CONVERT-ASET  is  fairly  straightforward,  except  that,  as  for  IF  nodes,  a 
special  case  is  made  of  trivial  nodes,  as  determined  by  the  TRIVP  slot. 

The  CATCH  construct  may  be  viewed  as  the  user's  one  interface  between  the 
"LAMBDA  world"  and  the  "continuation  world".  CONVERT-CATCH  arranges  its 
conversion  in  such  a way  as  to  eliminate  CATCH  entirely.  Because  CONTINUATION 
cnodes  provide  an  explicit  representation  for  the  continuations  involved,  there 
is  no  need  at  this  level  to  have  an  explicit  CCATCH  sort  of  cnode.  The  general 
idea  is: 


NODE  = (CATCH  a b)  and  CONT  = k becomes 
((CLANBDA  (q) 

((CLAMBDA  (a)  (RETURN  q b)) 

(CLAMBDA  ( *IGNORE*  V)  (RETURN  q V)))) 
k) 

In  the  case  where  the  given  continuation  k is  already  a cvariable,  then  it  need 
not  be  bound  to  a new  one  q.  Note  that  the  (renamed)  user  catch  variable  a is 
bound  to  a CLAMBDA  which  ignores  its  own  continuation,  and  returns  the  argument  V 
to  the  continuation  of  the  CATCH.  Thus  the  user  variable  a is  bound  not  to  an 
actual  CONTINUATION,  but  to  a little  CLAMBDA  which  interfaces  properly  between 
the  CLAMBDA  world  and  the  CONTINUATION  world.  The  uses  of  CVAlR  and  ICONT  are 
analogous  to  their  uses  in  CONVERT-IF. 
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(DEFINE  CONVERT- ASET 

(LAMBDA  (NODE  FM  CONT  MP) 

(IF  (NOOEYIRIVP  NODE) 

(MAKE-RETURN  (CONS- TR I VI AL  (NODE  • NODE))  CONT) 

(CONVERT  (ASETXBODY  FM) 

(LET  ((NM  (NOOEXNAME  (ASET\BOOY  FM)))) 

(CNOOIFY 

(CONS -CONTINUATION 
(VAR  = NM) 

(BODY  « (CNOOIFY 

(CONS -CASE T 
(CONT  • CONT) 

(VAR  « (ASETNVAR  FM)) 

( BODY  * (CNOOIFY  (CONS-CVARIABLE 

(VAR  - NM)))))))))) 

HP)))) 


;;  ISSUES  FOR  CONVERTING  CATCH: 

;;  (1)  MUST  BIND  THE  CATCH  VARIABLE  TO  A FUNNY  FUNCTION  WHICH  IGNORES  ITS  CONTINUATION: 
;;  (2)  IF  CONTINUATION  IS  NON -CVAR 1 ABLE . MUST  BIND  A VARIABLE  TO  IT. 


(0EF1NE 
CONVERT-CATCH 
(LAMBDA  (NODE  FM  CONT  MP ) 

(LET  ( (CVAR  (IF  (EQ  (TYPE  (CNOOENCFORM  CONT ) ) ‘CVARIABLE ) 

NIL 

(GENTEMP  ’CONT)))) 

(LET  ( ( 1CONT  (IF  CVAR 

(CNOOIFY  (CONS-CVARIABLE  (VAR  • CVAR))) 

CONT ) ) ) 

(LET  ((CP  (CNOOIFY 

( CONS -CCOMB I NAT ION 
(ARGS  - (LIST  (CNOOIFY 

(CONS-CLAMBDA 

(VARS  - (LIST  (CATCHWAR  FM))) 

(BODY  * (CONVERT  (CATCH\BOOY  FM)  ICONT  MP)))) 

(CNOOIFY 
(CONS-CLAMBDA 
(VARS  • •(* IGNORE*  V)) 

(BODY  • (MAKE-RETURN 

(CONS-CVARIABLE  (VAR  • ’V)) 

(CNOOIFY 
(CONS-CVARIABLE 
(VAR  * (CVAR  I ABLE  WAR 

(CNOOESCFORM  ICONT))))))))))))))) 

(IF  CVAR  (CNOOIFY 

(CONS -CCOMB INAT ION 


(ARGS 

CP)))))) 


(LIST  (CNOOIFY 

(CONS-CLAMBDA  (VARS  • (LIST  CVAR)) 
(BOOY  * CP))) 

CONT)))) 
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CONVERT-LABELS  simply  converts  all  the  labelled  function  definitions 
using  NIL  as  the  continuation  for  each.  This  reflects  the  fact  that  no  code 
directly  receives  the  results  of  closing  the  definitions;  rather,  they  simply 
become  part  of  the  environment.  The  body  of  the  LABELS  is  converted  using  the 
given  continuation. 

To  make  things  much  simpler  for  the  pass-Z  analysis  and  the  code 
generator,  it  is  forbidden  to  use  ASET'  on  a LABELS-bound  variable.  This  is  an 
arbitrary  restriction  imposed  by  RABBIT  (out  of  laziness  on  my  part  and  a desire 
to  concentrate  on  more  important  issues),  and  not  one  inherent  in  the  SCHEME 
language.  This  restriction  is  unnoticeable  in  practice,  since  one  seldom  uses 
ASET'  at  all,  let  alone  on  a LABELS  variable. 


The  conversion  of  COMBINATION  nodes  is  the  most  complex  of  all  cases. 
First,  a trivial  combination  becomes  simply  a TRIVIAL  cnode.  Otherwise,  the 
overall  idea  is  that  each  argument  is  converted,  and  the  continuation  given  to 
the  conversion  is  the  conversion  of  all  the  following  arguments.  The  conversion 
of  the  last  argument  uses  a continuation  which  performs  the  invocation  of 
function  on  arguments,  using  all  the  bound  variables  of  the  generated 
continuations.  The  end  result  is  a piece  of  code  which  evaluates  one  argument, 
binds  a variable  to  the  result,  evaluates  the  next,  etc.,  and  finally  uses  the 
results  to  perform  a function  call. 

To  simplify  the  generated  code,  the  arguments  are  divided  into  two 
classes.  One  class  consists  of  trivial  arguments  and  LAMBDA-expressions  (this 
class  is  precisely  the  class  of  "trivially  evaluable”  expressions  defined  in 
[Imperative]),  and  the  other  class  consists  of  the  remaining  arguments.  The 
successive  conversion  using  successive  continuations  as  in  the  general  theory  is 
only  performed  on  the  latter  class  of  arguments!  The  trivially  evaluable 
expressions  are  included  along  with  the  bound  variables  for  non-trivial  argument 
values  in  the  final  function  call.  For  example,  one  might  have  something  like: 


NODE  = (F00  (CONS  A B)  (BAR  A)  B (BAZ  B))  and  CONT  = k becomes 


(RETURN  (CONTINUATION 


(BAR  A)) 


(x) 

(RETURN  (CONTINUATION  (y) 

(FOO  k (CONS  A B)  x B y)) 

(BAZ  B))) 


where  FOO, 


(CONS  A B>,  and  B are  trivial,  but  (BAR  A)  and  (BAZ  B)  are  not. 
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;;;  ISSUES  FOR  CONVERTING  LABELS: 

;;;  (1)  MUST  CONVERT  ALL  THE  NAMEO  LAMBDA-EXPRESSIONS,  USING  A NULL  CONTINUATION. 

;;;  (2)  TO  MAKE  THINGS  EASIER  LATER.  WE  FORBID  ASET  ON  A LABELS  VARIABLE. 

(OEFINE  CONVERT-LABELS 

(LAMBDA  (NODE  FM  CONT  MP) 

(00  ( (F  ( LABELS\F NDEFS  FM)  (COR  F)) 

(V  (LABELSNFNVARS  FM)  (COR  V)) 

(CF  NIL  (CONS  (CONVERT  (CAR  F)  NIL  MP)  CF ) ) ) 

((NULL  F) 

( CNODIFY  (CONS-CLABELS  (FNVARS  * ( LABELSNFNVARS  FM)) 

(FNDEFS  ■=  (NREVERSE  CF)) 

(BODY  - (CONVERT  (LABELSNBODY  FM)  CONT  MP))))) 
(AND  (GET  (CAR  V)  ’WRITE-REFS) 

(ERROR  ' |Are  you  crazy,  using  ASET  on  a LABELS  var"iable?| 

(CAR  V) 

’FAIL-ACT))))) 
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The  separation  into  two  classes  is  accomplished  by  the  outer  DO  loop. 

DELAY-FLAGS  is  a list  of  flags  describing  whether  the  code  can  be  "delayed"  (not 
converted  using  strung-out  continuations)  because  it  is  trivially  evaluable.  The 
inner  DO  loop  of  the  three  (which  loops  on  variables  A,  j,  and  Z,  not  A,  D,  and 
F!  ) then  constructs  the  final  function  call,  using  the  "delayed"  arguments  and 
generated  continuation  variables.  The  names  used  for  the  variables  are  the  names 
of  the  corresponding  nodes,  which  were  generated  by  NODIFY.  Finally,  the  middle 
DO  loop  (which  executes  last  because  the  "inner*  DO  loop  occurs  in  the 
initialization,  not  the  body,  of  the  "middle"  one)  generates  the  strung-out 
continuations,  converting  the  non-delayable  arguments  in  reverse  order,  so  as  to 
generate  the  converted  result  from  the  inside  out. 

The  net  effect  is  that  non-trivial  arguments  are  evaluated  from  left-to- 
right,  and  trivial  ones  are  also  (as  it  happens,  because  of  MacLISP  semantics), 
but  the  two  classes  are  intermixed.  This  is  where  RABBIT  takes  advantage  of  the 
SCHEME  semantics  which  decree  that  arguments  to  a combination  may  be  evaluated  in 
any  order.  It  is  also  why  CHECK-COMBINATION-PEFFS  tries  to  detect  infractions  of  , 

this  rule.  \ 

A special  trick  is  that  if  the  given  continuation  is  a variable,  and  the 
combination  is  of  the  form  ((LAMBDA  ...)  ...),  then  it  is  arranged  to  use  the 
given  continuation  as  the  continuation  for  converting  the  body  of  the  LAMBDA,  < 

rather  than  the  extra  variable  which  is  introduced  for  a continuation  in  the 
LAMBDA  variables  list  (see  CONVERT-LAMBDA-FM) . This  effectively  constitutes  the 
optimization  of  substituting  one  continuation  variable  for  another,  much  as  META- 
COMBINATION-LAMBDA may  substitute  one  variable  for  another.  (This  turns  out  to 
be  the  only  optimization  of  importance  to  be  done  on  pass-Z  cnode  code;  rather 
than  building  a full-blown  optimizer  for  pass-Z  cnode-trees,  or  arranging  to  make 
the  optimizer  usable  on  both  kinds  of  data  structures,  it  was  easier  to  tweak  the 
conversion  of  combinations.)  The  substitution  is  effected  by  passing  a non-NIL 
CNAME  argument  to  CONVERT-LAMBDA-FORM,  as  computed  by  the  form  (AND  (NULL  (CDR 
A))  ...). 
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;;;  ISSUE  5 FOR  CONVERT ING  COMBINATIONS : 

;;;  (1)  TRIVIAL  ARGUMENT  EVALUATIONS  ARE  DELAYED  AND  ARE  NOT  BOUND  TO  THE  VARIABLE  OF 
;;;  A CONTINUATION.  WE  ASSUME  THEREBY  THAT  THE  COMPILER  IS  PERMITTED  TO  EVALUATE 
;;;  OPERANDS  IN  ANY  ORDER. 

;;;  (2)  ALL  NON-DELAYABLE  COMPUTATIONS  ARE  ASSIGNED  NAMES  AND  STRUNG  OUT  WITH  CONTINUATIONS. 
;;;  (3)  IF  CONT  IS  A CVARIABLE  AND  THE  COMBINATION  IS  ((LAMBDA  ...)  ...)  THEN  WHEN  CONVERTING 
;;;  THE  LAMBDA-EXPRESSION  WE  ARRANGE  FOR  ITS  BODY  TO  REFER  TO  THE  CVARIABLE  CONT  RATHER 
;;;  THAN  TO  ITS  OWN  CONTINUATION.  THIS  CROCK  EFFECTIVELY  PERFORMS  THE  OPTIMIZATION  OF 
;;;  SUBSTITUTING  ONE  VARIABLE  FOR  ANOTHER.  ONLY  ON  CONTINUATION  VARIABLES  (WHICH  COULDN'T 
;;;  BE  CAUGHT  BY  META-EVALUATE ) . 

(DEF INE 

CONVERT-COMBINATION 
(LAMBDA  (NODE  FM  CONT  MP) 

(IF  ( NODE\TR IVP  NODE) 

(MAKE-RETURN  (CONS-TRIVIAL  (NODE  = NODE))  CONT) 

(OO  ((A  (COMB I NAT ION\ARGS  FM)  (CDR  A)) 

(DELAY-FLAGS  NIL 

(CONS  (OR  ( NOOE\TRIVP  (CAR  A)) 

(EO  (TYPE  (NOOEXFORM  (CAR  A)))  ’LAMBDA)) 
DELAY-FLAGS))) 

((NULL  A) 

(DO  ((A  (REVERSE  ( COMB  l N AT IONXARGS  FM))  (CDR  A)) 

(D  DELAY-FLAGS  (COR  D)) 

(F  (CNOOIFY 

(CONS  - CCOMB I NAT  1 ON 

(ARGS  = (00  ((A  (REVERSE  ( COMB  I NAT ION\ARGS  FM))  (COR  A)) 

(D  DELAY-FLAGS  (CDR  D)) 

(Z  NIL  (CONS  (IF  (CAR  0) 

(IF  (EO  (TYPE  (NOOE\FORM  (CAR  A))) 
’LAMBDA) 

(CNOOIFY 

(CONVERT-LAMBOA-FM 
(CAR  A) 

(AND  (NULL  (CDR  A)) 

(EO  (TYPE 

" (CNODEXCFORM  CONT)) 

’CVARIABLE) 
(CVARIABLE  WAR 
(CNODEXCFORM  CONT))) 

MP)) 

(CNODIFY 
(CONS-TRIVIAL 
(NODE  • (CAR  A))))) 

(CNODIFY 
(CONS -CVARIABLE 

(VAR  • ( NODE X NAME  (CAR  A)))))) 

Z))) 

((NULL  A)  (CONS  (CAR  Z)  (CONS  CONT  (COR  Z)))))))) 

(IF  (CAR  D)  F 

(CONVERT  (CAR  A) 

(CNOOIFY  ( CONS -CONT INUAT ION 

(VAR  • (NODEXNAME  (CAR  A))) 

(BOOY  • F))) 

MP ) ) ) ) 

((NULL  A)  F))))))) 


J 
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Once  the  pass-Z  cnode-tree  is  constructed,  a pass-Z  analysis  is  performed 
in  a manner  very  similar  to  the  pass-1  analysis.  As  before,  successive  routines 
are  called  which  recursively  process  the  code  tree  and  pass  information  up  and 
down,  filling  in  various  slots  and  putting  properties  on  the  property  lists  of 
variable  names. 

The  first  routine,  CENV-ANALYZE,  is  similar  to  ENV-ANALYZE,  but  differs 
in  some  important  respects.  Two  slots  are  filled  in  for  each  cnode.  The  slot 
ENV  is  computed  from  the  top  down,  while  REFS  is  computed  from  the  bottom  up. 

ENV  is  the  environment,  a list  of  bound  variables  visible  to  the  cnode. 
The  ENV  slot  in  the  node-tree  was  a mapping  (an  alist),  but  this  ENV  is  only  a 
list.  The  argument  ENV  is  used  in  the  analysis  of  CVARIABLE  and  CASET  nodes. 
The  cnode  slot  ENV  is  included  only  for  debugging  purposes,  and  is  never  used  by 
RABBIT  itself. 

REFS  is  analogous  to  the  REFS  slot  of  a node-tree:  it  is  the  set  of 
variables  bound  above  and  referenced  below  the  cnode.  It  differs  from  the  pass-1 
analysis  in  that  variables  introduced  to  name  continuations  and  variables  bound 
by  continuations  are  also  accounted  for.  In  the  case  of  a TRIVIAL  cnode, 
however,  the  REFS  are  precisely  those  of  the  contained  node. 

The  argument  FNP  to  CENV-ANALYZE  in  non-NIL  iff  the  given  cnode  occurs  in 
"functional  position"  of  a CCOMBINATION  or  RETURN  cnode.  This  is  used  when  a 
variable  is  encountered;  on  the  property  list  a VARIABLE-REFP  property  is  placed 
iff  FNP  is  NIL,  indicating  that  the  variable  was  referenced  in  "variable  (non- 
function) position".  This  information  will  be  used  by  the  next  phase,  BIND- 
ANALYZE . 
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WE  WISH  TO  DETERMINE  THE  ENVIRONMENT  AT  EACH  CNOOE . 
AND  DETERMINE  WHAT  VARIABLES  ARE  BOUND  ABOVE  AND 
REFERRED  TO  BELOW  EACH  CNODE . 


FOR  EACH  CNODE  WE  FILL  IN  THESE  SLOTS: 

ENV  THE  ENVIRONMENT  SEEN  AT  THAT  CNODE  (A  LIST  OF  VARS) 

REFS  VARIABLES  BOUND  ABOVE  AND  REFERRED  TO  BELOW  THAT  CNOOE 

FOR  EACH  VARIABLE  REFERRED  TO  IN  NON-FUNCTION  POSITION 

BY  A CVAR I ABLE  OR  CTRIVIAL  CNOOE  WE  GIVE  A NON-NIL  VALUE  TO  THE  PROPERTY: 
VAR  1 ABLE -REFP 


;;;  FNP  IS  NON-NIL  IFF  CNODE  OCCURS  IN  FUNCTIONAL  POSITION 


(DEF INE 
CENV-ANALYZE 
(LAMBDA  (CNOOE  ENV  FNP) 

(LET  ( (CFM  ( CNODE \CFORM  CNODE))) 

( ALTER-CNOOE  CNODE  (ENV  :=  ENV)) 

(EOCASE  (TYPE  CFM) 

(TRIVIAL 

(CENV-TRIV-ANALYZE  ( TRIVIALXNODE  CFM)  FNP) 

(ALTER-CNOOE  CNODE 

(REFS  :*  ( NOOE\REF S ( TRIVIAL \NOOE  CFM))))) 

(CVARIABLE 

(LET  ((V  (CVARIABLE WAR  CFM))) 

( ADDPROP  V CNODE  'READ-REFS) 

(OR  FNP  (PUTPROP  V T ' VARIABLE -REFP ) ) 

(ALTER-CNOOE  CNOOE 

(REFS  :*  (AND  (MEMO  V ENV) 

(LIST  (CVAR  I ABLE  WAR  CFM))))))) 

(CLAMBDA 

(LET  ((B  (CLAMBDA\BOOY  CFM))) 

(CENV-ANALYZE  B (APPEND  (CLAMBDAWARS  CFM)  ENV)  NIL) 

(LET  ((REFS  (SETDIFF  (CNODE \REFS  B)  (CLAMBDAWARS  CFM)))) 
(ALTER-CNOOE  CNOOE  (REFS  :*  REFS))))) 

(CONTINUATION 

(LET  ((B  ( CONTINUAT IONXBOOY  CFM))) 

(CENV-ANALYZE  B (CONS  (CONTINUATIONWAR  CFM)  ENV)  NIL) 

(LET  ((REFS  (REMOVE  (CONTINUATIONWAR  CFM)  (CNOOEXREFS  B)))) 
(ALTER-CNOOE  CNOOE  (REFS  :•  REFS))))) 

(C  IF 

(LET  ((PRED  (CIFXPREO  CFM)) 

(CON  (CIFXCON  CFM)) 

(ALT  (CIFXALT  CFM))V 
(CENV-ANALYZE  PREO  ENV  NIL) 

(CENV-ANALYZE  CON  ENV  NIL) 

(CENV-ANALYZE  ALT  ENV  NIL) 

(ALTER-CNOOE  CNOOE 

(REFS  :•  (UNION  (CNOOEXREFS  PRED) 

(UNION  (CNOOEXREFS  CON) 

(CNOOEXREFS  ALT))))))) 


(CASET 

(LET  ((V  (CASETWAR  CFM)) 

(CN  (CASETXCOHT  CFM)) 

(B  (CASE TXBOOY  CFM))) 

(PUTPROP  (CASETWAR  CFM)  T ' VAR IABLE -REFP ) 

(CENV-ANALYZE  CN  ENV  T) 

(CENV-ANALYZE  B ENV  NIL) 

(ALTER-CNOOE  CNOOE 

(REFS  :•  (LET  ((R  (UNION  (CNOOEXREFS  CN) 

(CNOOEXREFS  6)))) 

(IF  (MEMO  V ENV)  (ADJOIN  V R)  R) ) ) ) ) > 

(CLABELS 

(LET  ((LENV  (APPEND  ( CL ABEL S\F NVARS  CFM)  ENV))) 

(DO  (<F  (CLABELSXFNDEFS  CFM)  (COR  F)) 

(R  NIL  (UNION  R (CNOOEXREFS  (CAR  F))))) 


•a. 
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070 
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((NULL  F) 

071 

(LET  ((B  ( CLABELSNBODY  CM))! 

072 

(CENV-ANALYZE  B LENV  NIL) 

073 

( ALTER-CNOOE  CNOOE 

074 

(REFS  :*  (SETDIFF  (UNION  R (CNOOENREFS  B)) 

1 075 

(CLABELSNFNVARS  CFM)))))) 

0 76 

(CENV-ANALYZE  (CAR  F)  LENV  NIL)))) 

077 

( CCOMB INAT ION 

0 78 

(LET  ((ARGS  ( CCOMB INAT ION \ARGS  CFM))) 

079 

(CENV-ANALYZE  (CAR  ARGS)  ENV  T) 

080 

(COND  ((AND  (EQ  (TYPE  (CNOOENCFORM  (CAR  ARGS)))  'TRIVIAL) 

081 

( EQ  (TYPE  (NOOENFORM  ( TRIVIALNNOOE 

082 

(CNOOENCFORM  (CAR  ARGS))))) 

083 

•VARIABLE) 

084 

(TRIVFN  ( VAR  I ABLE  WAR 

085 

(NODENFORM 

086 

( TRIVIAL NNODE 

087 

(CNOOENCFORM 

088 

(CAR  ARGS))))))) 

089 

(CENV-ANALYZE  (CAOR  ARGS)  ENV  T) 

090 

( CENV-CCOMB INAT ION- ANALYZE  CNOOE 

091 

ENV 

092 

(CODR  ARGS) 

093 

(UNION  (CNOOENREFS  (CAR  ARGS)) 

094 

(CNODENREFS  (CAOR  ARGS))))) 

095 

(T  ( CENV-CCOMB INAT ION- ANALYZE  CNOOE 

096 

ENV 

097 

(COR  ARGS) 

0 98 

(CNODENREFS  (CAR  ARGS))))))) 

099 

(RETURN 

100 

(LET  ( ( C (RETURNNCONT  CFM)) 

101 

(V  (RETURNNVAL  CFM))) 

102 

(CENV-ANALYZE  C ENV  T) 

103 

(CENV-ANALYZE  V ENV  NIL) 

104 

(ALTER-CNODE  CNOOE 

; 105 

* 

(REFS  :«  (UNION  (CNODENREFS  C)  (CNOOENREFS  V)))))))))) 
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The  only  purpose  of  CENV-TRIV-ANALYZE  is  to  go  through  the  code  for  a 
TRIVIAL  cnode,  looking  for  variables  occurring  in  other  than  function  position, 
in  order  to  put  appropriate  VARIABLE-REFP  properties.  Notice  that  the  types 
LAMBDA  and  LABELS  do  not  occur  in  the  EQCASE  expression,  as  nodes  of  those  types 
can  never  occur  in  trivial  expressions. 

CENV-CCOMBINATION-ANALYZE  is  a simple  routine  which  analyzes  CCOMBINATION 
cnodes;  it  is  a separate  routine  only  because  it  is  used  in  more  than  one  place 
in  CENV-ANALYZE.  It  could  have  been  made  a local  subroutine  by  using  a LABELS  in 
CENV-ANALYZE,  but  I elected  not  to  do  so  for  purely  typographical  reasons. 
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002  ;;;  THIS  FUNCTION  MUST  GO  THROUGH  AND  LOCATE  VARIABLES  APPEARING  IN  NON-FUNCTION  POSITION. 

003 

00 A (OEFINE  CENV-TRIV-ANALYZE 

005  (LAMBDA  (NODE  FNP) 

006  (LET  ((FM  ( NODESFORM  NODE))) 

007  (EQCASE  (TYPE  FM) 

OOS  (CONSTANT  NIL) 

009  (VARIABLE 

010  (OR  FNP  (PUTPROP  ( VARIABLENVAR  FM)  T 'VAR  I ABLE -REF  P ) ) ) 

011  ( LAMBDA 

012  (OR  FNP 

013  (ERROR  ’ |Tr1v1al  closure  - CENV-TRIV-ANALYZE | NOOE  ‘FAIL-ACT)) 

014  (CENV-TRIV-ANALYZE  (LAMBDAVBODY  FM)  NIL)) 

015  (IF 

016  (CENV-TRIV-ANALYZE  ( IF\PRED  FM)  NIL) 

017  (CENV-TRIV-ANALYZE  (IFNCON  FM)  NIL) 

018  (CENV-TRIV-ANALYZE  ( IF\ALT  FM)  NIL)) 

019  (ASET 

020  (PUTPROP  (ASETWAR  FM)  T ' VARIABLE-REFP) 

021  (CENV-TRIV-ANALYZE  (ASETNBOOY  FM)  NIL)) 

022  (COMBINATION 

023  (DO  ((A  ( COMB  I NAT I ON\ARGS  FM)  (COR  A)) 

024  (F  T NIL)) 

025  ((NULL  A)) 

026  (CENV-TRIV-ANALYZE  (CAR  A)  F))))))) 

027 

028  (DEFINE  CENV-CCOMBINATION-ANALYZE 

029  (LAMBDA  (CNODE  ENV  ARGS  FREFS) 

030  (DO  ((A  ARGS  (COR  A)) 

031  (R  FREFS  (UNION  R (CNODENREFS  (CAR  A))))) 

032  ((NULL  A) 

033  (ALTER-CNOOE  CNODE  (REFS  :•  R))) 

034  (CENV-ANALYZE  (CAR  A)  ENV  NIL)))) 


The  binding  analysis  is  the  most  complicated  phase  of  pass  2.  It 
determines  for  each  function  whether  or  not  a closure  structure  will  be  needed 
for  it  at  run  time  (and  if  so,  whether  the  closure  structure  must  contain  a 
pointer  to  the  code);  it  determines  for  each  variable  whether  or  not  it  can  be 
referred  to  by  a run-time  closure  structure;  and  it  determines  for  each  function 
how  arguments  will  be  passed  to  it  (because  for  internal  functions  not  apparent 
to  the  "outside  world",  any  arbitrary  argument-passing  convention  may  be  adopted 
by  the  compiler  to  optimize  register  usage;  in  particular,  arguments  which  are 
never  referred  to  need  never  even  be  actually  passed).  If  flow  analysis 
determines  that  a given  variable  always  denotes  (a  closure  of)  a given  functional 
(CLANBDA)  expression,  then  a KNOWN-FUNCTION  property  is  created  to  connect  the 
variable  directly  to  the  function  for  the  benefit  of  the  code  generator. 

BIND-ANALYZE  is  just  a simple  dispatch  to  one  of  many  specialists,  one 
for  each  type  of  CNODE.  TRIVIAL  and  CVARIABLE  cnodes  are  handled  directly 
because  they  are  simple. 

The  argument  FNP  is  NIL,  EZCLOSE,  or  NOCLOSE,  depending  respectively  on 
whether  a full  closure  structure,  a closure  structure  without  a code  pointer,  or 
no  closure  structure  will  be  needed  if  in  fact  CNODE  turns  out  to  be  of  type 
CLAMBDA  (or  CONTINUATION).  Normally  it  is  NIL,  unless  determined  otherwise  by  a 
parent  CLABELS  or  COMBINATION  cnode. 

The  argument  NAME  is  meaningful  only  if  the  CNODE  argument  is  of  type 
CLAMBDA  or  CONTINUATION.  If  non-NIL,  it  is  a suggested  name  to  use  for  the 
cnode.  This  name  will  later  be  used  by  the  code  generator  as  a tag.  The  only 
reason  for  using  the  suggestion  rather  than  a generated  name  (and  in  fact  one 
will  be  generated  if  the  suggested  name  is  NIL)  is  to  make  it  easier  to  trace 
things  while  debugging. 


REFD-VARS  is  a utility  routine.  Given  a set  of  variables,  it  returns  the 
subset  of  them  that  are  actually  referenced  (as  determined  by  the  READ-REFS  and 
WRITE-REFS  properties  which  were  set  up  by  ENV-ANALYZE  and  CENV-ANALYZE) . 


; ; ; BINDING  ANALYSIS. 
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FOR  IACH  CNOOE  WE  FILL  IN: 

;;;  CLOVARS  THE  SET  OF  VARIABLES  REFERRED  TO  BY  CLOSURES 

;;;  AT  OR  BELOW  THIS  NOOE  (SHOULD  ALWAYS  BE  A 

; ; ; SUBSET  OF  REFS) 

FOR  EACH  CLAMBDA  AND  CONTINUATION  WE  FILL  IN: 

;;;  FNP  NON-NIL  IFF  REFERENCED  ONLY  AS  A FUNCTION. 

;;;  WILL  BE  ’EZCLOSE  IF  REFERRED  TO  BY  A CLOSURE, 

;;;  AND  OTHERWISE  ’NOCLOSE. 

;;;  TVARS  VARIABLES  PASSED  THROUGH  TEMP  LOCATIONS  WHEN  CALLING 

; ; ; THIS  FUNCTION 

;;;  NAME  THE  NAME  OF  THE  FUNCTION  (USED  FOR  THE  PROG  TAG) 

;;;  FOR  EACH  CLABELS  WE  FILL  IN: 

;;;  EASY  REFLECTS  FNP  STATUS  OF  ALL  THE  LABELLED  FUNCTIONS 

;;;  FOR  EACH  VARIABLE  WHICH  ALWAYS  OENOTES  A CERTAIN  FUNCTION  WE 
; ; ; PUT  THE  PROPERTIES: 

;;;  KNOWN-FUNCTION  IFF  THE  VARIABLE  IS  NEVER  ASET 

;;;  THE  VALUE  OF  THE  KNOWN-FUNCTION  PROPERTY  IS  THE  CNOOE  FOR 
;;;  THE  FUNCTION  DEFINITION. 

;;;  FOR  EACH  LABELS  VARIABLE  IN  A LABELS  OF  THE  'EZCLOSE  VARIETY 
WE  PUT  THE  PROPERTY: 

;;;  LABELS -FUNCTION 

;;;  TO  INDICATE  THAT  ITS  "EASY"  CLOSURE  MUST  BE  COR'D  TO  GET  THE 
;;;  CORRECT  ENVIRONMENT  (SEE  PRODUCE-LABELS). 

;;;  NAME,  IF  NON-NIL,  IS  A SUGGESTED  NAME  FOR  THE  FUNCTION 

(DEFINE  BINO-ANALYZE 

(LAMBDA  (CNODE  FNP  NAME) 

(LET  ( (CFM  (CNODENCFORM  CNODE))) 

(EQCASE  (TYPE  CFM) 

(TRIVIAL 

(ALTER-CNODE  CNODE  (CLOVARS  :«  NIL))) 

(CVARIABLE 

(ALTER-CNODE  CNODE  (CLOVARS  :«  NIL))) 

(CLAMBDA 

(BIND-ANAL YZE-CLAMBDA  CNODE  FNP  NAME  CFM)) 
(CONTINUATION 

(BIND-ANAL  YZE.-CONTINUATION  CNODE  FNP  NAME  CFM)) 
(CIF 

(BIND  ANALYZE -Cl F CNODE  CFM)) 

(CASET 

(BIND- ANAL YZE-CASET  CNOOE  CFM)) 

(CLABELS 

(BIND-ANAL YZE-CLABELS  CNOOE  CFM)) 

(CCOMBINATION 

(BINO-ANALYZE -CCOMBINATION  CNODE  CFM)) 

(RETURN 

(BIND-ANAL YZE-RETURN  CNOOE  CFM)))))) 

(DEFINE  REFO-VARS 

(LAMBDA  (VARS) 

(DO  ((V  VARS  (CDR  V)) 

(W  NIL  (IF  (OR  (GET  (CAR  V)  ’READ-REFS) 

(GET  (CAR  V)  ’WRITE-REFS)) 

(CONS  (CAR  V)  W) 

V))) 

((NULL  V)  (NRE VERSE  V))))) 
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For  a CLANBDA  cnode,  BIND-ANALYZE-CLAMBDA  first  analyzes  the  body.  The 
CLOVARS  component  of  the  cnode  is  then  calculated.  If  the  CLANBDA  will  have  a 
run-time  closure  structure  created  for  it,  then  any  variable  it  references  is 
obviously  referred  to  by  a closure.  Otherwise,  only  the  CLOVARS  of  its  body  are 
included  in  the  set. 


The  TVARS  component  is  the  set  of  parameters  for  which  arguments  will  be 
passed  in  a non-standard  manner.  Non-standard  argument-passing  is  used  only  for 
NOCLOSE- type  functions  (though  in  principle  it  could  also  be  used  for  EZCLOSE- 
type  functions  also).  In  this  case,  only  referenced  variables  (as  determined  by 
REFD-VARS)  are  actually  passed.  The  code  generator  uses  TVARS  for  two  purposes: 
when  compiling  the  CLANBDA  itself,  TVARS  is  used  to  determine  which  arguments  are 
in  which  registers;  and  when  compiling  calls  to  the  function,  TVARS  determines 
which  registers  to  load  (see  LANBDACATE). 


The  FNP  slot  is  just  filled  in  using  the  FNP  parameter, 
not  suggested  for  the  NAHE  slot,  an  arbitrary  name  is  generated. 


If  a name  was 


i 


BIND-ANALYZE-CONTINUATION  is  entirely  analogous  to  BIND-ANALYZE-CLANBDA. 

BIND-ANALYZE-CIF  straightforwardly  analyzes  recursively  its  sub-cnodes, 
and  then  passes  the  union  of  their  CLOVARS  up  as  its  own  CLOVARS. 


BIND-ANALYZE-CASET  tries  to  be  a little  bit  clever  about  the  obscure  case 
produced  by  code  such  as: 

(ASET1  FOO  (LAMBDA  ...)) 

where  the  continuation  is  a CONTINUATION  cnode  (rather  than  a CVARIABLE).  It  is 
then  known  that  the  variable  bound  by  the  CONTINUATION  (not  the  variable  set  by 
the  CASET!  ! ) will  have  as  its  value  the  (closure  of  the)  CLANBDA-expression . 
This  allows  for  the  creation  of  a KNOWN-FUNCTION  property,  etc.  This  analysis  is 
very  similar  to  that  performed  by  BIND-ANALYZE-RETURN  (see  below).  Aside  from 
this,  the  analysis  of  a CASET  is  simple;  the  CLOVARS  component  is  merely  the 
union  of  the  CLOVAR  slots  of  the  sub-cnodes. 
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(DEFINE  B IND- ANALYZE -CLAMBDA 

(LAMBDA  ( CNODE  FNP  NAME  CFM) 

(BLOCK  (BIND-ANAL YZE  (CLAMBDAXBOOY  CFM)  NIL  NIL) 
(ALTER-CNODE  CNODE 

(CLOVARS  :■  (IF  (EQ  FNP  'NOCLOSE) 
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(ALTER-CLAMBDA  CFM 

(FNP  :• 
( TVARS 


:•  (IF  (EQ  FNP  'NOCLOSE) 

(CNODEXCLOVARS  (CLAMBDAXBOOY  CFM)) 
( CNODE XREFS  CNODE)))) 

FNP) 

:•  (IF  (EQ  FNP  ' NOCLOSE) 

(REFD-VARS  (CLAMBDAXVARS  CFM)) 
NIL)) 

• (OR  NAME  (GENTEMP  *F))))))) 


(DEFINE  BIND-ANAL YZE -CONTINUATION 

(LAMBDA  (CNODE  FNP  NAME  CFM) 

(BLOCK  (BIND-ANAL YZE  (CONTINUATIONXBOOY  CFM)  NIL  NIL) 

(ALTER-CNOOE  CNODE 

(CLOVARS  :•  (IF  (EQ  FNP  'NOCLOSE) 

(CNOOEXCLOVARS  (CONTINUATIONXBOOY  CFM)) 


(ALTER -CONTINUATION  CFM 

(FNP  :• 
(TVARS 


(CNOOE XREFS  CNODE)))) 

FNP) 

:*  (IF  (EQ  FNP  ' NOCLOSE) 

(REFD-VARS  (LIST  (CONTINUATIONXVAR  CFM))) 
NIL)) 

• (OR  NAME  (GENTEMP  *C))))))) 


(DEFINE  BIND-ANALYZE-CIF 

(LAMBDA  (CNODE  CFM) 

(BLOCK  (BIND-ANALYZE  (CIFXPREO  CFM)  NIL  NIL) 

(BIND-ANAL YZE  (CIFXCON  CFM)  NIL  NIL) 

(BIND-ANALYZE  (CIFXALT  CFM)  NIL  NIL) 

(ALTER-CNODE  CNOOE 

(CLOVARS  :•  (UNION  (CNODEXCLOVARS  (CIFXPREO  CFM)) 

(UNION  (CNODEXCLOVARS  (CIFXCON  CFM)) 

(CNOOEXCLOVARS  (CIFXALT  CFM))))))))) 

(DEFINE  BIND-ANAL YZE-CASET 
(LAMBDA  (CNOOE  CFM) 

(LET  ( (CN  (CASETXCONT  CFM)) 

(VAL  (CASETXBOOY  CFM))) 

(BIND-ANALYZE  CN  'NOCLOSE  NIL) 

(COND  ((AND  (EQ  (TYPE  (CNOOEXCFORM  CN))  'CONTINUATION) 

(EQ  (TYPE  (CNOOEXCFORM  VAL))  'CLAMBDA)) 

(LET  ((VAR  (CONTINUATIONXVAR  (CNOOEXCFORM  CN)))) 

(PUTPROP  VAR  VAL  ’KNOWN-FUNCTION) 

(BIND-ANALYZE  VAL 

(AND  (NOT  (GET  VAR  ' VARIABLE -REFP)  ) 

(IF  (MEMQ  VAR 

(CNOOEXCLOVARS 
(CONTINUATIONXBOOY 
(CNOOEXCFORM  CN)))) 

'EZCLOSE 

(BLOCK  ( ALTER -CONT I NUAT ION  (CNOOEXCFORM  CN) 

(TVARS  NIL)) 

’NOCLOSE))) 

NIL))) 

(T  (BIND-ANALYZE  VAL  NIL  NIL))) 

(ALTER-CNOOE  CNOOE 

(CLOVARS  :•  (UNION  (CNOOEXCLOVARS  CN) 

(CNOOEXCLOVARS  VAL))))))) 
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The  binding  analysis  of  a CLABELS  is  very  tricky  because  of  the 
possibility  of  mutually  referent  functions.  For  example,  suppose  a single 
CLABELS  binds  two  CLAMBDA  expressions  with  names  F00  and  BAR.  Suppose  that  the 
body  of  FOO  refers  to  BAR,  and  that  of  BAR  to  F00.  Should  FOO  and  BAR  be  of  FNP- 
type  NIL,  EZCLOSE,  or  NOCLOSE?  If  either  is  of  type  EZCLOSE,  then  the  other  must 
be  also;  but  the  decision  cannot  be  made  sequentially.  It  is  even  more 
complicated  if  one  must  be  of  type  NIL. 

An  approximate  solution  is  used  here,  to  prevent  having  to  solve 
complicated  simultaneous  constraints.  It  is  arbitrarily  decreed  that  all 
functions  of  a single  CLABELS  shall  all  have  the  same  FNP  type.  If  any  one  must 
be  of  type  NIL,  then  they  all  are.  Otherwise,  it  is  tentatively  assumed  that 
they  all  may  be  of  type  NOCLOSE.  If  this  assumption  is  disproved,  then  the 
analysis  is  retroactively  patched  up. 

The  outer  DO  loop  of  BIND-ANALYZE-CLABELS  creates  KNOWN -FUNCTION 
properties,  and  determines  (in  the  variable  EZ)  whether  any  of  the  labelled 
functions  needs  a full  closure  structure.  (This  can  be  done  before  analyzing  the 
functions,  because  it  is  determined  entirely  by  the  VARIABLE-REFP  properties 
created  in  the  previous  phase.)  The  inner  DO  loop  then  analyzes  the  functions. 
When  this  is  done,  if  EZ  is  NOCLOSE,  and  it  turns  out  that  it  should  have  been 
EZCLOSE  after  all,  then  the  third  DO  loop  forcibly  patches  the  CLAMBDA  cnodes  for 
the  labelled  functions,  and  the  AMAPC  form  creates  LABELS- FUNCTION  properties  as 
a flag  for  the  code  generator. 


BIND-ANALYZE-RETURN  simply  analyzes  the  continuation  and  return  value 
recursively,  and  then  merges  to  two  CLOVARS  sets  to  produce  its  own  CLOVARS  set. 
A special  case  is  when  the  two  sub-cnodes  are  respectively  a CONTINUATION  and  a 
CLAMBDA;  then  special  work  is  done  because  it  is  known  that  the  variable  bound 
by  the  CONTINUATION  will  always  denote  the  (closure  of  the)  CLAMBDA-expression . 
A nasty  trick  is  that  if  it  turns  out  that  the  CLAMBDA  can  be  of  type  NOCLOSE, 
then  the  TVARS  slot  of  the  CONTINUATION  is  forcibly  set  to  NIL  (i.e.  the  empty 
set).  This  is  because  no  argument  will  really  be  passed.  (This  fact  is  also 
known  by  the  LAMBDACATE  routine  in  the  code  generator.) 
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(0EF1NE  B 1 NO -ANALYZE -CL ABEL 5 
(LAMBDA  (CNODE  CFM) 

(BLOCK  (BIND-ANAL YZE  (CLABELSNBOOY  CFM)  NIL  NIL) 

(00  ((V  ( CL ABELS \FNVARS  CFM)  (COR  V)) 

(0  (CLABELSNFNDEFS  CFM)  (COR  0)) 

(EZ  ’NOCLOSE  (AND  (NULL  (GET  (CAR  V)  'VARIABLE -REFP) ) EZ))) 
((NULL  V) 

(ALTER-CLABELS  CFM  (EASY  EZ)) 

(00  ((V  (CLABELS\F NVARS  CFM)  (COR  V)) 

(0  (CLABELSNFNDEFS  CFM)  (COR  0)) 

(CV  ( CNODE \CLOVARS  (CLABELSNBOOY  CFM)) 

(UNION  CV  (CNOOENCLOVARS  (CAR  0))))) 

((NULL  0) 

( ALTER-CNOOE  CRODE  (CLOVARS  :■=  CV)) 

(CONO  ((AND  EZ  (INTERSECT  CV  ( LABELSNFNVARS  CFM))) 

(DO  ((0  (CLABELSNFNDEFS  CFM)  (COR  0)) 

(CV  (CNOOENCLOVARS  (CLABELSNBOOY  CFM)) 

(UNION  CV  (CNOOENCLOVARS  (CAR  D))))) 

((NULL  0) 

(ALTER-CNOOE  CNODE  (CLOVARS  :«  CV))) 
(ALTER-CLAMBOA  (CNOOENCFORM  (CAR  0)) 

(FNP  :•  ' EZCLOSE  ) 

( TVARS  :•  NIL)) 

(ALTER-CNOOE  (CAR  D) 

(CLOVARS  :■  (CNODENREFS  (CAR  D) ) ) ) ) 
( AMAPC  (LAMBDA  (V)  (PUTPROP  V T ' LABELS- FUNCT ION ) ) 
(CLABELSNF NVARS  CFM)) 

(ALTER-CLABELS  CFM  (EASY  :*  ‘EZCLOSE))))) 
(BIND-ANALYZE  (CAR  0)  EZ  (CAR  V)))) 

(PUTPROP  (CAR  V)  (CAR  0)  ’KNOWN-FUNCTION))))) 


(DEFINE  BIND- ANALYZE  - RE  TURN 


(LAMBDA  (CNODE  CFM) 

(LET  ((CN  (RETURNNCONT  CFM)) 

(VAL  ( RETURNNVAL  CFM))) 

(BINO-ANALYZE  CN  'NOCLOSE  NIL) 

(COND  ((AND  (EO  (TYPE  (CNOOENCFORM  CN))  'CONTINUATION) 

(EQ  (TYPE  (CNOOENCFORM  VAL))  'CLAMBOA)) 

(LET  ((VAR  ( CONTI NUATIONNVAR  (CNODE NCFORM  CN)))) 
(PUTPROP  VAR  VAL  ’KNOWN-FUNCTION) 

(BINO-ANALYZE  VAL 

(AND  (NOT  (6ET  VAR  'VAR I ABLE -REFP)) 

( IF  (MEMO  VAR 

(CNOOENCLOVARS 
(CONTINUATIONNBOOY 
(CNOOENCFORM  CN)))) 


'EZCLOSE 

(BLOCK  (ALTER -CONTINUATION  (CNOOENCFORM  CN) 

(TVARS  :•  NIL)) 

'NOCLOSE))) 

WILD) 

(T  (BINO-ANALYZE  VAL  NIL  NIL))) 

(ALTER-CNOOE  CNOOE 

(CLOVARS  :•  (UNION  (CNOOENCLOVARS  CN) 

(CNOOENCLOVARS  VAL))))))) 
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BIND-ANALYZE-CCOMBINATION  first  analyzes  the  function  position  of  the 
combination.  It  then  distinguishes  three  cases:  a trivial  function,  a CLAMBDA- 
expression  function,  and  all  others. 

In  the  case  of  a trivial  function,  the  continuation  (which  is  the  second 
item  in  ARGS)  can  be  analyzed  with  FNP  = NOCLOSE,  because  the  compilation  will 
essentially  turn  into  "calculate  all  other  arguments,  apply  the  trivial  function, 
and  then  give  the  result  to  the  continuation".  A CCONBINATION  which  looks  like: 

(a-trivial-function  (CONTINUATION  (var)  ...)  argl  ...  argn) 
is  compiled  almost  as  if  it  were: 

((CONTINUATION  (var)  ...)  (a-trivial-function  argl  ...  argn)) 
and  of  course  the  continuation  can  be  treated  as  of  type  NOCLOSE. 

In  the  case  of  a CLAMBDA-expression,  the  arguments  are  all  analyzed,  and 
then  the  AMAPC  expression  goes  back  over  the  TVARS  list  of  the  CLAMBDA,  and 
removes  from  the  TVARS  set  each  variable  corresponding  to  an  argument  which  the 
analysis  has  proved  to  be  a NOCLOSE-type  KNOWN- FUNCTION.  This  is  because  no 
actual  argument  will  be  passed  at  run  time  for  such  a function,  and  so  there  is 
no  need  to  allocate  a register  through  which  to  pass  that  argument. 

In  the  third  case,  the  arguments  are  analyzed  straightforwardly  by  BIND- 
CCOMBINATION-ANALYZE. 


BIND-CCOMBINATION-ANALYZE  does  the  dirty  work  of  analyzing  arguments  of  a 
CCOMB INAT ION  and  updating  the  CLOVARS  slot  of  the  COMBINATION  cnode.  If  VARS  is 
non-NIL,  then  it  is  the  variables  of  the  CLAMBDA-expression  which  was  in  the 
function  position  of  the  CCOMBINATION.  As  the  arguments  are  analyzed,  KNOWN- 
FUNCTION  properties  are  put  on  the  variables  as  appropriate,  and  the  correct 
value  of  FNP  is  determined  for  the  recursive  call  to  BIND-ANALYZE.  If  VARS  is 
NIL,  then  this  code  depends  on  the  fact  that  (CDR  NIL)=NIL  in  NacLISP. 
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(DEFINE  BIND -ANALYZE -CCOMB I NAT  ION 
(LAMBDA  (CNODE  CFM) 

(LET  ((ARGS  (CCOMB INAT ION\ARGS  CFM))) 

(BIND-ANALYZE  (CAR  ARGS)  'NOCLOSE  NIL) 

(LET  ((FN  (CNOOEVCFORM  (CAR  ARGS)))) 

(COND  ((AND  (EQ  (TYPE  FN)  ’TRIVIAL) 

( EQ  (TYPE  (NOOENFORM  ( TR IVIAL VNOOE  FN))) 

•VARIABLE) 

(TRIVFN  (VARIABLEWAR  (NOOENFORM  ( TRIVIAL  NNODE  FN))))) 
(BIND-ANALYZE  (CADR  ARGS)  'NOCLOSE  NIL) 

( BIND -CCOMB I NAT  ION -ANALYZE  CNODE 

(CDDR  ARGS) 

NIL 

( CNODE NCLOVARS  (CAOR  ARGS)))) 

((EQ  (TYPE  FN)  'CLAMBDA) 

( B IND -CCOMB I NAT  ION -ANALYZE  CNODE 

(CDR  ARGS) 

(CLAMBDANVARS  FN) 

(CNODE NCLOVARS  (CAR  ARGS))) 

( AMAPC  (LAMBDA  (V) 

(IF  (LET  ((KFN  (GET  V 'KNOWN-FUNCTION))) 

(ANO  KFN 

(EQ  (EQCASE  (TYPE  ( CNODE NCFORM  KFN)) 
(CLAMBDA 
(CLAMBDANFNP 
(CNODE NCFORM  KFN))) 
(CONTINUATION 
( CONT INUAT IONNF  NP 
(CNODE NCFORM  KFN)))) 

' NOCLOSE ) ) ) 

(ALTER-CLAMBDA 

FN 

( TVARS  (DELQ  V ( CL AMBDANTVARS  FN)))))) 
(CLAMBOANTVARS  FN))) 

(T  ( B IND-CCOMB INAT JON -ANAL YZE  CNODE 

(COR  ARGS) 

NIL 

(CNODENCLOVARS  (CAR  ARGS))))))))) 

;;;  VARS  MAY  BE  NIL  - WE  DEPENO  ON  (CDR  NIL)‘NIL. 


(DEFINE  BIND -CCOMB  INAT  ION -ANAL  YZE 

(LAMBDA  (CNODE  ARGS  VARS  FCV) 

(00  ((A  ARGS  (COR  A)) 

(V  VARS  (CDR  V)) 

(CV  FCV  (UNION  CV  (CNOOENCLOVARS  (CAR  A))))) 

((NULL  A) 

(ALTER-CNODE  CNOOE  (CLOVARS  :•  CV))) 

(COND  ((AND  VARS 

(MEMO  (TYPE  (CNODENCFORM  (CAR  A)))  '(CLAMBDA  CONTINUATION)) 
(NOT  (GET  (CAR  V)  ' WRI TE-REFS ) ) ) 

(PUTPROP  (CAR  V)  (CAR  A)  'KNOWN-FUNCTION) 

(BINO-ANALYZE  (CAR  A) 

(AND  (NOT  (GET  (CAR  V)  ' VARIABLE -REFP ) ) 

(IF  (MEMO  (CAR  V)  FCV) 

•EZCLOSE 

•NOCLOSE)) 

NIL)) 

(T  (BIND-ANALYZE  (CAR  A)  NIL  NIL)))))) 
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DEPTH-ANALYZE  allocates  registers  through  which  to  pass  arguments  to 
NOCLOSE  functions,  i.e.  for  arguments  corresponding  to  elements  of  TVARS  sets. 

An  unclever  stack  discipline  is  used  for  allocating  registers.  Each  function  is 
assigned  a "depth",  which  is  zero  for  a function  whose  FNP  is  NIL  or  EZCLOSE 
(such  functions  take  their  arguments  in  the  standard  registers  **ONE**  through 
**EICiHT**,  assuming  that  **NUMBER-OF-ARG-REGS**  is  8,  as  it  is  in  the  current 
SCHEME  implementation).  For  a NOCLOSE  function  the  depth  is  essentially  the 
depth  of  the  function  in  whose  body  the  NOCLOSE  function  appears,  plus  the  number 
of  TVARS  belonging  to  that  other  function  (if  it  is  of  type  NOCLOSE)  or  the 
number  of  standard  argument  registers  used  by  it  (if  it  is  NIL  or  EZCLOSE).  For 
example,  consider  this  code: 

(CLAMBDA  (C  X Y) 

((CLAMBDA  (K  F Z) 

((CLAMBDA  (Q  W V) 

• ••) 

CONT-57  *3  '4 ) ) 

(CONTINUATION  (V)  ...)  \ 

(CLAMBDA  (H)  ...) 

•FOO)) 

Suppose  that  the  outer  CLAMBDA  is  of  type  EZCLOSE  for  some  reason.  Its  depth  is  j 

0.  The  two  CLAMBDA-expressions  and  CONTINUATION  immediately  within  it  have  depth 
3 (assuming  the  CONTINUATION  and  second  CLAMBDA  are  of  type  NOCLOSE  — the  first 
CLAMBDA  definitely  is).  The  innermost  CLAMBDA  is  then  of  depth  4 (for  Z,  which 
will  be  in  TVARS  --  K and  F will  not  be  because  they  are  names  for  NOCLOSE 
functions,  assuming  K and  F have  no  WRITE-REFS  properties). 

To  each  function  is  also  attached  a MAXDEP  value,  which  is  in  effect  the 
number  of  registers  used  by  that  function,  including  all  NOCLOSE  functions  within 
it.  This  is  used  in  only  one  place  in  the  code  generator,  to  generate  a SPECIAL 
declaration  for  the  benefit  of  the  MacLISP  compiler,  which  compiles  the  output  of 
RABBIT.  For  most  constructs  this  is  simply  the  numerical  maximum  over  the  depths 
of  all  sub-cnodes.  Toward  this  end  the  maximum  depth  of  the  cnode  is  returned  as 
the  value  of  DEPTH-ANALYZE. 


* 


001 

002 

003 

004 

005 

006 
007 
006 

009 

010 

011 

012 

013 

014 

015 

016 
017 
016 

019 

020 
021 
022 

023 

024 

025 

026 

027 

028 

029 

030 

031 

032 

033 

034 

035 

036 

037 
0 38 

039 

040 

041 

042 

043 

044 

045 

046 

047 
046 

049 

050 

051 

052 

053 

054 

055 

056 

057 
056 
059 


05/15/78  Page  38 


;;;  FOR  EACH  CLAMBDA  AND  CONTINUATION  WE  FILL  IN: 

;;;  OEP  DEPTH  OF  TEMP  VAR  USAGE  AT  THIS  POINT 

;;;  MAXOEP  MAX  DEPTH  BELOW  THIS  POINT 

;;;  VALUE  OF  DEPTH- ANALYZE  IS  THE  MAX  DEPTH 

(DEFINE  DEPTH- ANALYZE 

(LAMBDA  (CNOOE  OEP) 

(LET  ( (CFM  ( CNOOE \CFORM  CNODE ) ) ) 

(EQCASE  (TYPE  CFM) 

(TRIVIAL  DEP) 

(CVARIABLE  DEP) 

(CLAMBDA 

(LET  ((MD  (DEPTH- ANALYZE  (CLAMBDAXBOOY  CFM) 

(IF  (EQ  (CLAMBOAXFNP  CFM)  1 NOCLOSE) 

(♦  DEP  (LENGTH  ( CL AMBDA\TVARS  CFM))) 
(MIN  (LENGTH  (CLAMBDANVARS  CFM)) 

(♦  1 *»NUMBER-OF  -ARG-REGS**  )))))) 

(ALTER -CLAMBDA 
CFM 

(DEP  :»  (IF  (EQ  (CLAMBOA\FNP  CFM)  • NOCLOSE)  DEP  0)) 

(MAXOEP  :*  MO)) 

MO)) 

(CONTINUATION 
(LET  ((MD  (DEPTH- ANALYZE 

(CONT INUAT IONXBODY  CFM) 

(IF  (EQ  (CONT INUAT ION\FNP  CFM)  ’NOCLOSE) 

(+  DEP  (LENGTH  (CONT INUAT IONXTVARS  CFM))) 

*))>> 

(ALTER -CONT INUAT ION 
CFM 

(DEP  :«  (IF  (EQ  (CONT INUAT EON\FNP  CFM)  ’NOCLOSE)  DEP  0)) 
(MAXOEP  :»  MD)) 

MD)) 

(CIF 

(MAX  (DEPTH-ANALYZE  (CIF\PRED  CFM)  DEP) 

(DEPTH- ANALYZE  (CIFXCON  CFM)  DEP) 

(DEPTH-ANALYZE  (CIFXALT  CFM)  DEP))) 

(CASET 

(MAX  (DEPTH-ANALYZE  (CASEACONT  CFM)  DEP) 

(DEPTH-ANALYZE  (CASETXBODY  CFM)  DEP))) 

(CLABELS 

(LET  ((DP  (IF  (EQ  (CLABELSXEASY  CFM)  ’NOCLOSE) 

OEP 

( + DEP  (LENGTH  (CLABELSXF NVARS  CFM)))))) 

(DO  ( (0  (CLABELSXFNDEFS  CFM)  (COR  D)) 

(MD  (DEPTH-ANALYZE  (CLABELSXBOOY  CFM)  OP) 

(MAX  MD  (DEPTH-ANALYZE  (CAR  D)  DP)))) 

((NULL  D)  MD)))) 

( CCOMB INAT ION 

(DO  ((A  (CCOMB INAT IONXARGS  CFM)  (COR  A)) 

(MD  0 (MAX  MD  (DEPTH-ANALYZE  (CAR  A)  OEP)))) 

((NULL  A)  MD))) 

(RETURN 

(MAX  (DEPTH-ANALYZE  (RETURNXCONT  CFM)  OEP) 

(DEPTH-ANALYZE  (RETURNXVAL  CFM)  DEP))))))) 
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Just  as  DEPTH-ANALYZE  assigns  locations  in  registers  ("stack  locations”) 
for  variables,  so  CLOSE-ANALYZE  assigns  locations  in  consed  ("heap-allocated") 
environment  structures  for  variables.  The  general  idea  is  that  if  the  value  of  a 
an  accessible  variable  is  not  in  a register,  then  it  is  in  the  structure  which  is 
in  the  register  **ENV**.  This  structure  can  in  principle  be  any  structure 
whatsoever,  according  to  the  whim  of  the  compiler.  RABBIT'S  whim  is  to  be  very 
unclever;  the  structure  of  **ENV**  is  always  a simple  list  of  variable  values. 
Thus  a variable  in  the  **ENV**  structure  is  always  accessed  by  a series  of  CDR 
operations  and  then  one  CAR  operation. 

(More  clever  would  be  to  maintain  the  environment  as  a chained  list  of 
vectors,  each  vector  representing  a non-null  contour.  Then  a variable  could  be 
accessed  by  a series  of  "CDR"  operations  equal  to  the  number  of  contours  (rather 
than  the  number  of  variables)  between  the  binding  and  the  reference,  followed  by 
a single  indexing  operation  into  the  contour-vector.  The  number  of  "CDR" 
operations  could  be  reduced  by  having  a kind  of  "cache"  for  the  results  of  such 
contour  operations;  such  a cache  would  in  fact  be  equivalent  to  the  "display" 
used  in  many  Algol  implementations.  If  such  a display  were  maintained,  a 
variable  could  be  accessed  simply  by  a two-level  indexing  operation.) 

Within  the  compiler  an  environment  structure  is  also  represented  as  a 
simple  list,  with  the  name  of  a variable  occupying  the  position  which  its  value 
will  occupy  in  the  run-time  environment. 

For  every  CLAMBDA,  CONTINUATION,  and  CLABELS,  a slot  called  CONSENV  is 
filled  in,  which  is  a list  representing  what  the  environment  structure  will  look 
like  when  the  closure(s)  for  t»*jt  construct  are  to  be  constructed,  if  any.  This 
is  done  by  walking  over  the  crode-tree  and  doing  to  the  environment 
representation  precisely  what  will  be  done  to  the  real  environment  at  run  time. 

There  is  a problem  with  the  possibility  that  a variable  may  initially  be 
in  a register  (because  it  was  passed  as  an  argument,  for  example),  but  must  be 
transferred  to  a consed  environment  structure  because  the  variable  is  referred  to 
by  the  code  of  a closure  to  be  constructed.  There  are  two  cases:  either  the 
variable  has  no  WRITE-REFS  property,  or  it  does. 

If  it  does  not,  then  there  is  no  problem  with  the  value  of  the  variable 
being  in  two  or  more  places,  so  it  is  s imply  copied  and  consed  into  the 
environment  as  necessary.  The  CLOSEREFS  slot  of  a function  is  a list  of  such 
variables  which  must  be  added  to  the  consed  environment  before  constructing  the 
closure . 


If  the  variable  does  have  WRITE-REFS,  then  the  value  of  the  variable  must 
have  a single  "home",  to  prevent  inconsistencies  when  it  is  altered.  (This  is 
far  easier  than  arranging  for  every  ASF.T'  operation  to  update  all  extant  copies 
of  a variable's  value.)  It  is  arranged  that  such  variables,  if  they  are  referred 
to  be  closures  (are  in  the  CLOVARS  set  of  the  CLAMBDA  which  binds  them)  will 
exist  only  in  the  consed  environment.  Thus  for  each  CLAMBDA  the  ASETVARS  set  is 
that  subset  of  the  lambda  variables  which  have  WRITE-REFS  and  are  in  the  CLOVARS 
set.  Before  the  body  of  the  CLAMBDA  is  executed,  a piece  of  code  inserted  by  the 
code  generator  will  transfer  the  variables  from  their  registers  immediately  into 
the  consed  environment,  and  the  values  in  the  registers  are  thereafter  never 
referred  to. 


;;;  CLOSURE  ANALYSIS  FOR  CPS  VERSION 
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FOR  EACH  CLAMBOA,  CONTINUATION.  AND  CLABELS  WE  I ILL  IN: 

CONSENV  THE  CONSEO  ENVIRONMENT  OF  THE  CLAMBOA. 

CONTINUATION,  OR  CLABELS  (BEFORE  ANY 
CLOSEREFS  HAVE  BEEN  CONSEO  ON) 

FOR  EACH  CLAMBOA  AND  CONTINUATION  WE  FILL  IN: 

CLOSEREFS  A LIST  OF  VARIABLES  REFERENCED  BY  THE  CLAMBOA 

OR  CONTINUATION  WHICH  ARE  NOT  IN  THE  CONSEO 
ENVIRONMENT  AT  THE  POINT  OF  THE  ClAHBDA  OR 
CONTINUATION  ANO  SO  MUST  BE  CONSEO  ONTO  THE 
ENVIRONMENT  AT  CLOSURE  TIME;  HOWEVER,  THESE 
NEEO  NOT  BE  CONSEO  ON  IF  THE  CLAMBOA  OR 
CONTINUATION  IS  IN  FUNCTION  POSITION  OF 
A FATHER  WHICH  IS  A CCOMBINATION  OR  RETURN 
FOR  THE  CLAMBOA'S  IN  THE  FNDEFS  OF  A CLABELS.  THESE  MAY  BE 
SLIGHTLY  ARTIFICIAL  FOR  THE  SAKE  OF  OPTIMIZATION  (SEE  BELOW). 

FOR  EACH  CLAMBOA  WE  FILL  IN: 

ASETVARS  A LIST  OF  THE  VARIABLES  BOUND  IN  THE  CLAMBOA 

WHICH  ARE  EVER  ASET  AND  SO  MUST  BE  CONSEO 
ONTO  THE  ENVIRONMENT  IMMEDIATELY  IF  ANY 
CLOSURES  OCCUR  IN  THE  BOOY 
FOR  EACH  CLABELS  WE  FILL  IN: 

FNENV  VARIABLES  TO  BE  CONSEO  ONTO  THE  CURRENT  CONSENV 

BEFORE  CLOSING  THE  LABELS  FUNCTIONS 


;;;  CENV  IS  THE  CONSEO  ENVIRONMENT  (A  LIST  OF  VARIABLES) 

(OEFINE  F ILTER-CLOSEREFS 

(LAMBOA  (REFS  CENV) 

(DO  ((X  REFS  (COR  X)) 

(Y  NIL 

(IF  (OR  (MEMO  (CAR  X)  CENV) 

(LET  ( ( KFN  (GET  (CAR  X)  ’KNOWN-FUNCTION))) 

(ANO  KFN 

(EQ  (EQCASE  (TYPE  (CNOOEXCFORM  KFN)) 

(CLAMBOA 

(CLAMBDAXFNP  (CNOOEXCFORM  KFN))) 
(CONTINUATION 

(CONTI NUAT IONXF NP  (CNOOEXCFORM  KFN)))) 
' NOCLOSE)))) 

Y 

(CONS  (CAR  X)  Y)))) 

((NULL  X)  (NRE VERSE  Y))>)> 


L 
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For  each  CLABELS  a set  called  FNENV  is  computed.  This  is  strictly  an 
efficiency  hack,  which  attempts  to  arrange  it  such  that  the  several  closures 
constructed  for  a CLABELS  share  environment  structure.  The  union  over  all  the 
variables  needed  is  computed,  and  these  variables  are,  at  run  time,  all  consed 
onto  the  environment  before  any  of  the  closures  is  constructed.  The  hope  is  that 
the  intersection  of  these  sets  is  large,  so  that  the  total  environment  consing  is 
less  than  if  a separate  environment  were  consed  for  each  labelled  closure. 


FILTER-CLOSEREFS  is  a utility  routine  which,  given  a set  of  variables  and 
an  environment  representation,  returns  that  subset  of  the  variables  which  are  not 
already  in  the  environment  and  so  do  not  denote  known  NOCLOSE  functions.  (Those 
variables  which  are  already  in  the  consed  environment  or  which  do  denote  NOCLOSE 
functions  of  course  need  not  be  added  to  that  consed  environment.) 


The  argument  CENV  to  CLOSE-ANALYZE  is  the  representation  of  the  consed 
environment  (in  **ENV**)  which  will  be  present  when  the  code  for  CNODE  is 
executed.  The  only  processing  of  interest  occurs  for  CLAHBDA,  CONTINUATION,  and 
CLABELS  cnodes. 

The  CLOSEREFS  of  a CLANBDA  are  those  which  are  referred  to  by  the  CLAHBDA 
and  which  are  not  already  in  CENV,  provided  the  CLAHBDA  is  not  of  type  NOCLOSE. 
The  ASETVARS  are  precisely  those  VARS  which  have  WRITE-REFS  and  are  in  CLOVARS. 

The  processing  for  a CONTINUATION  is  similar.  As  a consistency  check,  we 
make  sure  the  bound  variable  has  no  WRITE-REFS  (it  should  be  Impossible  for  an 
ASET’  to  refer  to  the  bound  variable  of  a CONTINUATION). 

For  a CLABELS,  the  FNENV  set  is  first  calculated  and  added  to  CENV.  This 
new  CENV  is  then  used  to  process  the  definitions  and  body  of  the  CLABELS. 
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(DEFINE  CLOSE -ANALYZE 

(LAMBDA  ( CNOOE  CENV) 

(LET  ( ( CFM  (CNOOEVCFORM  CNODE ) ) ) 

( EQCASE  (TYPE  CFM) 

(TRIVIAL  NIL) 

(CVARIABLE  NIL) 

(CLAMBOA 

(LET  ( (CR  (AND  (NOT  (EQ  (CLAMBOA\FNP  CFM)  ’NOCLOSE)) 

(FILTER-CLOSEREFS  (CNOOEVREFS  CNOOE)  CENV))) 

(AV  (00  ((V  (CLAMBOAWARS  (CNOOEVCFORM  CNODE))  (COR  V)) 

(A  NIL  (IF  (AND  (GET  (CAR  V)  'WRITE-REFS) 

(MEMO  (CAR  V) 

( CNOOE \CLOVARS 
(CLAMBOANBOOY  CFM)))) 

(CONS  (CAR  V)  A) 

A))) 

((NULL  V)  A)))) 

(ALTER-CLAMBDA  CFM 

(CONSENV  :»  CENV) 

(CLOSEREFS  :=  CR) 

( ASETVARS  :*  AV)) 

(CLOSE -ANALYZE  (CLAMBDAVBODY  CFM) 

(APPEND  AV  CR  CENV)))) 

(CONTINUATION 

(AND  (GET  ( CONT I NUAT ION\VAR  CFM)  ’WRITE-REFS) 

(ERROR  ' |How  could  an  ASET  refer  to  a continuation  variable?! 
CNOOE 

■FAIL-ACT)) 

(LET  ((CR  (AND  (NOT  (EQ  (CONTINUATION\FNP  CFM)  'NOCLOSE)) 

(FILTER-CLOSEREFS  (CNOOEVREFS  CNOOE)  CENV)))) 

(ALTER -CONT I NUAT  ION  CFM 

(CONSENV  :»  CENV) 

(CLOSEREFS  :*  CR)) 

(CLOSE -ANALYZE  (CONTINUATION\BOOY  CFM) 

(APPEND  CR  CENV)))) 

(CIF 

(CLOSE -ANALYZE  (CIFNPRED  CFM)  CENV) 

(CLOSE-ANAL YZE  (C!F\CON  CFM)  CENV) 

(CLOSE -ANALYZE  (C!F\ALT  CFM)  CENV)) 

(CASET 

(CLOSE- ANAL YZt  (CASET\CONT  CFM)  CENV) 

(CLOSE -ANALYZE  (CASET\BODY  CFM)  CENV)) 

(CLABELS 
((LAMBDA  (CENV) 

(BLOCK  ( AMAPC  (LAMBOA  (0)  (CLOSE -ANALYZE  0 CENV)) 
(CLABELSXFNDEFS  CFM)) 

(CLOSE-ANAL YZE  (CLABELSNBOOY  CFM)  CENV))) 

(COND  ((CLABELSVEASY  CFM) 

(OO  ((D  (CLABELSXFNDEFS  CFM)  (COR  D)) 

(R  NIL  (UNION  R (CNOOEVREFS  (CAR  D))))) 

((NULL  D) 

(LET  ((E  (FILTER-CLOSEREFS  R CENV))) 

(ALTER-CLABELS  CFM 

(FNENV  :•  E) 

(CONSENV  :*  CENV)) 

(APPEND  E CENV))))) 

(T  (ALTER-CLABELS  CFM 

(FNENV  :•  NIL) 

(CONSENV  :•  CENV)) 

CENV)))) 

(CCOMBINATION 

(AMAPC  (LAMBOA  (A)  (CLOSE-ANAL YZE  A CENV)) 

( CCOMB I NAT IONNARGS  CFM))) 

(RETURN 

(CLOSE-ANALYZE  (RETURNNCONT  CFM)  CENV) 

(CLOSE -ANALYZE  (RETURNWAL  CFM)  CENV)))))) 


We  now  come  to  the  code  generator,  which  is  altogether  about  one-fourth 
of  all  the  code  making  up  RABBIT.  Part  of  this  is  because  much  code  which  is 
conceptually  singular  is  duplicated  in  several  places  (partly  as  a result  of  the 
design  error  in  which  COMBINATION  and  RETURN  nodes,  or  CLAMBDA  and  CONTINUATION 
nodes,  are  treated  distinctly;  and  also  because  a powerful  text  editor  made  it 
very  easy  to  make  copies  of  the  code  for  various  purposes!).  The  rest  is  just 
because  code  generation  is  fairly  tricky  and  requires  checking  for  special  cases. 
A certain  amount  of  peephole  optimization  is  performed;  this  is  not  so  much  to 
improve  the  efficiency  of  the  output  code,  as  to  make  the  output  code  easier  to 
read  for  a human  debugging  RABBIT.  A large  fraction  of  the  output  code  (perhaps 
ten  to  twenty  percent)  is  merely  comments  of  various  kinds  intended  to  help  the 
debugger  of  RABBIT  figure  out  what  happened. 

One  problem  in  the  code  generator  is  that  most  functions  need  to  be  able 
to  return  two  things:  the  code  generated  for  a given  cnode-tree,  and  a list  of 
functions  encountered  in  the  cnode-tree,  for  which  code  is  to  generated 
separately  later.  We  solve  this  problem  by  a stylistic  trick,  namely  the 
explicit  use  of  continuation-passing  style.  Many  functions  in  the  code  generator 
take  an  argument  named  "C" . This  argument  is  itself  a function  of  two  arguments: 
the  generated  code  and  the  deferred-function  list.  The  function  which  is  given  C 
is  expected  to  compute  its  two  results  and  then  invoke  C,  giving  it  the  two 
results  as  arguments.  (In  practice  a function  which  gets  an  argument  C also  gets 
an  argument  FNS,  which  is  a deferred-functions  list;  the  function  is  expected  to 
add  its  deferred  functions  onto  this  list  FNS,  and  give  the  augmented  FNS  list  to 
C along  with  the  generated  code.) 

Other  arguments  which  are  frequently  passed  within  the  code  generator  are 
CENV  (a  representation  of  the  consed  environment);  BLOCKFNS,  a list  describing 
external  functions  compiled  together  in  this  "block"  or  "module"  (this  is  used  to 
compile  a direct  GOTO  rather  than  a more  expensive  call  to  an  external  function, 
the  theory  being  that  several  functions  might  be  compiled  together  in  a single 
module  ns  with  the  InterLISP  "block  compiler";  this  theory  is  not  presently 
implemented,  however,  and  so  BLOCKFNS  always  has  just  one  entry);  PROGNAME,  a 
symbol  which  at  run  time  will  have  as  its  value  the  MacLISP  SUBR  pointer  for  the 
current  module  (this  SUBR  pointer  is  consed  into  closures  of  compiled  functions, 
and  so  any  piece  of  code  which  constructs  a closure  will  need  to  refer  to  the 
value  of  this  symbol);  and  RNL,  the  "rename  list",  an  alist  pairing  internal 
variable  names  to  pieces  of  code  for  accessing  them  (when  code  to  reference  a 
variable  is  to  be  generated,  the  piece  of  code  in  RNL  is  used  if  the  variable  is 
found  in  RNL,  and  otherwise  a reference  to  the  variable  name  itself  (which  is 
therefore  global)  is  output). 

COMP  I LATE  is  the  topmost  routine  of  the  code  generator.  FN  is  the  cnode- 
tree  for  a function  to  be  compiled.  The  topmost  cnode  should  of  course  be  of 
type  Ciw.MBDA  or  CONTINUATION.  For  a CLAMBDA,  the  call  to  REGSLIST  sets  up  the 
initial  RNl.  (rename  list)  for  references  to  the  arguments.  Also,  when  COMP-BODY 
has  returned  the  code  (the  innermost  LAMBDA-expression  in  COMPILATE  is  the 
argument  C given  to  COMP-BODY),  SET-UP-ASETVARS  is  called  to  take  care  of  copying 
the  variables  in  the  ASETVARS  set  into  the  consed  environment.  The  code  for  a 
CONTINUATION  is  similar,  except  that  a CONTINUATION  has  no  ASETVARS  and  only  one 
bound  variable. 
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;;;  CODE  GENERATION  ROUTINES 


PROGNAME : 
FN: 

EXTERNAL? : 
RNl  : 

BLOCKFNS: 

FNS: 

C: 


NAME  OF  A VARIABLE  WHICH  AT  RUN  TIME  WILL  HAVE 
AS  VALUE  THE  SUBR  POINTER  FOR  THE  PROG 

THE  FUNCTION  TO  COMPILE  (A  CLAMBOA  OR  CONTINUATION  CNOOE) 
NON-NIL  IF  THE  FUNCTION  IS  EXTERNAL 
INITIAL  RENAME  LIST  (NON-NIL  ONLY  FOR  NOCLOSE  FNS). 
ENTRIES  ARE:  (VAR  . CODE) 

AN  AL 1ST  OF  FUNCTIONS  IN  THIS  BLOCK. 

ENTRIES  ARE:  (USERNAME  CNODE ) 

A LIST  OF  TUPLES  FOR  FUNCTIONS  YET  TO  BE  COMPILED; 

EACH  TUPLE  IS  (PROGNAME  FN  RNL) 

A CONTINUATION.  TAKING: 

CODE:  THE  PIECE  OF  MACLISP  CODE  FOR  THE  FUNCTION 

FNS:  AN  AUGMENTED  FNS  LIST 


(DEFINE  COMPILATE 

(LAMBDA  (PROGNAME  FN  RNL  BLOCKFNS  FNS  C) 

(LET  ( (CFM  (CNODENCFORH  FN))) 

(EQCASE  (TYPE  CFM) 

(CLAMBOA 

(LET  ( (CENV  (APPEND  ( CL AMBD A\ ASE TVARS  CFM) 

(CLAMBDAXCLOSEREFS  CFM) 

(CLAMBDAXCONSENV  CFM)))) 

(COHP-BOOY  (CLAMBDAXBOOY  CFM) 

(REGSLIST  CFM  T (ENVCARCDR  CENV  RNL)) 

PROGNAME 

BLOCKFNS 

CENV 

FNS 

(LAMBDA  (CODE  FNS) 

(C  (SET -UP -ASE TVARS  CODE 

( CL AMBD AXASE TVARS  CFM) 
(REGSLIST  CFM  NIL  NIL)) 

FNS))))) 

(CONTINUATION 

(LET  ((CENV  (APPEND  (CONTINUATIONXCLOSEREFS  CFM) 

(CONTINUATIONXCONSENV  CFM)))) 

(COMP-BODY  (CONTINUATIONXBOOY  CFM) 

(IF  (EQ  (CONTINUATIONXFNP  CFM)  'NOCLOSE) 

(IF  (NULL  ( CONT I NUAT IONXTVARS  CFM)) 

(ENVCARCDR  CENV  RNL) 

(CONS  (CONS  ( CONT I NUAT IONXVAR  CFM) 

(TEMPLOC  ( CONT I NUAT IONXDEP  CFM))) 
(ENVCARCDR  CENV  RNL))) 

(CONS  (CONS  ( CONT INUATIONXVAR  CFM) 

(CAR  **AR6UMENT-RE6ISTERS**)) 
(ENVCARCDR  CENV  RNL))) 


PROGNAME 

BLOCKFNS 

CENV 

FNS 


C)»>») 
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♦★ARGUMENT-REGISTERS**  is  a list  of  the  standard  "registers"  through 
which  arguments  are  passed.  In  the  standard  SCHEME  implementation  this  list  is: 

( **ONE**  **TWO**  **THREE**  **FOUR** 

**F IVE**  **SIX**  **SEVEN**  **EIGHT**) 


DF.PROGNIFY1  is  a peephole  optimizer.  It  takes  a MacLISP  form  and  returns 
a list  of  MacLISP  forms.  The  idea  is  that  if  the  given  form  is  (PROGN  . ..),  the 
keyword  PROGN  is  stripped  off;  also,  any  irrelevant  computations  (references  to 
variables  or  constants  other  than  in  the  final  position)  are  removed. 
(ATOMFLUSUP,  when  NIL,  suppresses  the  removal  of  symbols,  which  in  some  cases  may 
be  MacLISP  PROG  tags).  The  purpose  of  this  is  to  avoid  multiple  nesting  of  PROGN 
forms : 

(PROGN  (PROGN  a b)  (PROGN  (PROGN  c (PROGN  d e)  f)  g)) 

Any  code  generation  routine  which  constructs  a PROGN  with  a component  Q generated 
by  another  routine  generally  says: 

"(PROGN  ( SETQ  FOO  3)  <?( DEPROGNIFY  Q)  (GO  .THE-TAG)) 

The  "0"  means  that  the  list  of  forms  returned  by  the  call  to  DEPROGNIFY  (which  is 
actually  a macro  which  expands  into  a call  to  DEPROGNIFY1)  is  to  be  substituted 
into  the  list  (PROGN  ...)  being  constructed  by  the  operator.  Thus  rather 
than  the  nested  PROGN  code  shown  above,  the  code  generator  would  instead  produce: 

(PROGN  a b c d e f g) 

which  is  much  easier  to  read  when  debugging  the  output  of  RABBIT. 


TEMPLOC  is  a little  utility  which  given  the  number  (in  the  DEP  ordering 
used  by  DF.PTH-ANALYZE)  of  a register  returns  the  name  of  that  register. 
**CONT+ARG-REGS**  is  the  same  as  **ARGUMENT-REGISTERS**  except  that  the  name 
**CONT**  is  tacked  onto  the  front.  **CONT**  is  considered  to  be  register  0.  If  N 
is  greater  than  the  number  of  the  highest  standard  argument  register,  then  a new 
register  name  of  the  form  "-N-"  is  invented.  Thus  the  additional  temporary 
registers  are  called  - 1 1 - , -12-,  -13-,  etc. 
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;;;  DEPROGNIFY  IS  USEO  ONLY  TO  MAKE  THE  OUTPUT  PRETTY  BY  ELIMINATING 
;;;  UNNECESSARY  OCCURRENCES  OF  "PROGNV 

(OEFMAC  DEPROGNIFY  (FORM)  "( OEPROGNIF Y1  .FORM  NIL)) 

(SET1  *DEPROGN IF Y-COUNT*  0) 

(DEFINE  DEPROGNIFY1 

(LAMBDA  (FORM  ATOMFLUSHP ) 

(IF  (OR  (ATOM  FORM)  (NOT  (EQ  (CAR  FORM)  ' PROGN ) ) ) 

(LIST  FORM) 

(DO  ((X  (COR  FORM)  (COR  X)) 

(Z  NIL  (COND  ((NULL  (COR  X))  (CONS  (CAR  X)  Z)) 

((NULL  (CAR  X)) 

(INCREMENT  *DEPROGNIFY-COUNT* ) 

Z) 

((ATOM  (CAR  X)) 

(COND  (ATOMFLUSHP 

(INCREMENT  *DEPROGN IF Y-COUNT* ) 

Z) 

(T  (CONS  (CAR  X)  Z)))) 

((EQ  (CAAR  X)  ’QUOTE) 

(INCREMENT  «DEPROGNIFY-COUNT* ) 

Z) 

(T  (CONS  (CAR  X)  Z))))) 

((NULL  X)  ( NREVERSE  Z)))))) 

(DEFINE  TEMPLOC 

(LAMBDA  (N) 

(LABELS  ((LOOP 

(LAMBDA  (REGS  J) 

(IF  (NULL  RE6S) 

(IMPLODE  (APPEND  ’(-)  (EXPLODEN  N)  ’(-))) 

(IF  (*  J 0) 

(CAR  REGS)  ' 

(LOOP  (CDR  REGS)  (-  J 1))))))) 

(LOOP  •oCONT+ARG-REGS**  N)))) 


ENVCARCDR  takes  a set  of  variables  VARS  representing  the  consed 

environment,  and  an  old  rename  list  RNL,  and  adds  to  RNL  new  entries  for  the 

variables,  supplying  pieces  of  code  to  access  the  environment  structure.  For 

example,  suppose  RNL  were  NIL,  and  VARS  were  (A  B C).  Then  ENVCARCDR  would 

produce  the  list: 

((C  . (CAR  (CDR  (CDR  **ENV**)))) 

(B  . (CAR  (CDR  **ENV**) ) ) 

(A  . (CAR  **ENV**) ) ) 

where  each  variable  has  been  paired  with  a little  piece  of  code  which  can  be  used 
to  access  it  at  run  time.  This  example  is  not  quite  correct,  however,  because 
the  peephole  optimizer  DECARCDRATE  is  called  on  the  little  pieces  of  code; 
DECARCDRATE  collapses  CAR-CDR  chains  to  make  them  easier  to  read,  and  so  the  true 
result  of  ENVCARCDR  would  be: 

((C  . ( CADDR  **ENV**) ) 

(B  . (CADR  **ENV**) ) 

(A  . (CAR  **ENV**) ) ) 
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? (DEFINE  ENVCARCDR 
3 (LAMBDA  (VARS  RNL ) 

« (DO  ((X  ' *»ENV**  “(COR  ,X)) 

5 (V  VARS  ( CDR  V)) 

6 (R  RNL  (CONS  (CONS  (CAR  V)  (OECARCORATE  “(CAR  .X)))  R))) 

7 ((NOLL  V)  R)))) 
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REGSLIST  takes  a CLAMBDA  cnode,  a switch  AVP,  and  a rename  list  RNL.  It 
tacks  onto  RNL  new  entries  which  describe  how  to  access  the  arguments  of  the 
CLANBDA.  This  is  complicated  because  there  are  three  cases.  (1)  A NOCLOSE 
function  takes  its  arguments  in  non-standard  registers.  (Z)  Other  functions  of 
not  more  than  **NUMBER -OF- ARGUMENT- REGISTERS**  (the  length  of  the  **ARGUMENT- 
REGISTERS**  list)  arguments  takes  their  arguments  in  the  standard  registers.  (3) 
All  other  functions  takes  a list  of  arguments  in  the  first  argument  register 
(**ONE**),  except  for  the  continuation  in  **CONT**.  The  switch  AVP  tells  whether 
or  not  the  elements  of  ASETVARS  should  be  included  (non-nil  means  do  not 
include ) . 

As  an  example,  suppose  the  CLAMBDA  is  a NOCLOSE  with  DEP  = 1Z  and  TVARS  = 
(ABC  D),  and  suppose  that  AVP  = T and  RNL  = NIL.  Then  the  result  would  be: 

( ( D . -15- ) (C  . -14-)  (B  . -13-)  (A  . -1Z-)) 

As  another  example,  suppose  the  CLAMBDA  is  of  type  EZCLOSE  with  VARS  = (1C  X Y Z) 
and  ASETVARS  = (Y),  and  suppose  that  AVP  = NIL  and  RNL  = ((A  . -1Z-)).  Then  the 
result  would  be: 

((Z  . **THREE**)  (X  . **ONE**)  (X  . **CONT**)  (A  . - 1Z- ) ) 


SET-UP-ASETVARS  takes  a piece  of  code  (the  code  for  a CLAMBDA  body),  an 
ASETVARS  set  AV,  and  a rename  list.  If  there  are  no  ASETVARS,  then  just  the  code 
is  returned,  but  otherwise  a PROGN-form  is  returned,  which  ahead  of  the  code  has 
a SETQ  which  adds  the  ASETVARS  to  the  environment.  (LOOKUPICATE  takes  a variable 
and  a RNL  and  returns  a piece  of  code  for  referring  to  that  variable.)  For 
example,  suppose  we  had: 

CODE  = (GO  FOO) 

AV  = (A  C ) 

RNL  = ((C  . -14- ) (B  . -13-)  (A  . -12-)) 

Then  SET-UP-ASETVARS  would  return  the  code: 

( PROGN  (SETQ  **ENV**  (CONS  -1Z-  (CONS  -14-  **ENV**)))  (GO  FOO)) 
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;;;  AVP  NON-NIL  MEANS  THAT  ASETVARS  ARE  TO  BE  EXCLUDED  FROM  THE  CONSED  LIST. 

(DEFINE  REGSLIST 

(LAMBDA  (CLAM  AVP  RNL ) 

(LET  ( ( AV  (AND  AVP  (CL AMBOA\ ASETVARS  CLAM)))) 

(IF  (EQ  (CLAMBOA\FNP  CLAM)  'NOCLOSE) 

(DO  ((J  (CLAMBDANDEP  CLAM)  (♦  J 1)) 

(TV  (CLAMBDAUVARS  CLAM)  (CDR  TV)) 

(R  RNL 

(IF  (MEMO  (CAR  TV)  AV) 

R 

(CONS  (CONS  (CAR  TV)  (TEMPLOC  J))  R)))) 

((NULL  TV)  R)) 

(LET  ((VARS  (CLAMBOAWARS  CLAM))) 

(IF  (>  (LENGTH  (CDR  VARS))  **NUMBER-OF - ARG-REGS** ) 

(DO  ((X  (CAR  ** ARGUMENT -REGISTERS** ) *(COR  .X)) 

(V  (COR  VARS)  (CDR  V)) 

(R  (CONS  (CONS  (CAR  VARS)  ’**CONT**)  RNL) 

(IF  (MEMO  (CAR  V)  AV) 

R 

(CONS  (CONS  (CAR  V)  (OECARCDRATE  "(CAR  .X)))  R ) ) ) ) 
((NULL  V)  R)) 

(DO  ((V  VARS  (COR  V)) 

(X  **CONT+ARG-REGS**  (CDR  X)) 

(R  RNL 

(IF  (MEMO  (CAR  V)  AV) 

R 

(CONS  (CONS  (CAR  V)  (CAR  X))  R)))) 

((NULL  V)  R))>)>))) 

(DEFINE  SET -UP -ASETVARS 

(LAMBDA  (COOE  AV  RNL) 

IF  (NULL  AV) 

'OOE 

(PR06N  (SETQ  **ENV** 

.(DO  ((A  (REVERSE  AV)  (COR  A)) 

(E  1 *«ENV**  -(CONS  .(LOOKUPICATE  (CAR  A)  RNL)  ,E))) 
((NULL  A)  E))) 

•(DEPROGNIFY  COOE))))) 
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In  the  continuation-passing  style,  functions  do  not  return  values; 
instead,  they  apply  a continuation  to  the  value.  Thus,  the  body  of  a CLAMBDA- 
expression  is  a form  which  is  not  expected  to  produce  a value.  On  the  other 
hand,  such  a form  will  have  subforms  which  do  produce  values,  for  example 
references  to  variables. 

Thus  the  forms  to  be  dealt  with  in  the  code  generator  can  be  divided  into 
those  which  produce  values  and  those  which  do  not.  Initially  the  latter  will 
always  be  attacked,  as  the  body  of  a "function";  later  the  former  will  be  seen. 
COMP-BODY  takes  a valueless  form  and  compiles  it.  The  routine  ANALYZE,  which  we 
will  see  later,  handles  valued  forms. 

COMP-BODY  instantiates  a by  now  familiar  theme:  it  simply  dispatches  on 
the  type  of  BODY  to  some  specialist  routine.  In  the  case  of  a CLABELS,  it  first 
compiles  the  body  of  the  CLABELS  (which  itself  is  valueless  if  the  CLABELS  is 
valueless,  and  so  a recursive  call  to  COMP-BODY  is  used),  and  then  goes  to 
PRODUCE-LABELS.  For  a COMBINATION  or  RETURN,  it  does  a three-way  (for  RETURN, 
two-way)  sub-dispatch  on  whether  the  function  is  a TRIVFN,  a CLAMBDA  (or 
CONTINUATION),  or  something  else. 

The  PRODUCE  series  of  routines  produce  code  for  valueless  forms. 
PRODUCE- IF  calls  ANALYZE  on  the  predicate  (which  will  produce  a value),  and  COMP- 
BODY  on  the  consequent  and  alternative  (which  produce  no  value  because  the  entire 
CIF  does  not).  The  three  pieces  of  resulting  code  are  respectively  called  PRED, 
CON,  and  ALT.  These  are  then  given  to  CONDICATE,  which  generates  a NacLISP  COND 
form  to  be  output. 


RABBIT _568  05/15/78  _P»ae  45 

RNL  JS  THE  -RENAME  LIST*:  AN  AL  1ST  DESCRIBING  HOW  TO  REFER  TO  THE  VARIABLES  IN  THE 
ENVIRONMENT.  CENV  IS  THE  CONSED  ENVIRONMENT  SEEN  BY  THE  BOOY. 


(DEF INE 
COMP -BOOY 

(LAMBDA  (BODY  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 

(LET  ( (CFM  (CNOOEXCFORM  BOOY))) 

(EQCASE  (TYPE  CFM) 

( C IF 

(PRODUCE -IF  BODY  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C)) 

(CASET 

(PRODUCE -ASET  BOOY  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C)) 

(CLABELS 

(OR  (EQUAL  CENV  (CLABELSXCONSENV  CFM)) 

(ERROR  • (Environment  di jAgreeinent | BOOY  ‘FAIL-ACT)) 

(LET  ( ( LCENV  (APPEND  (CLABELS\FNENV  CFM)  CENV))) 

(COMP -BOOY 
(CLABELSXBODY  CFM) 

(ENVCARCOR  LCENV  RNL) 

PROGNAME 

BLOCKFNS 

LCENV 

FNS 

(LAMBDA  (LBOO  FNS) 

(PRODUCE-LABELS  BODY  LBOD  RNL  PROGNAME  BLOCKFNS  FNS  C))))) 

(COMBINATION 

(LET  ( ( FN  (CNOOEXCFORM  (CAR  ( CCOMB I NAT ION\ARGS  CFM))))) 

(COND  ((EO  (TYPE  FN)  ‘CLAMBOA) 

( PRODUCE -LAMBDA -COMB  I NAT  ION  BOOY  RNL  PROGNAME  BLOCKFNS  CEN  FNS  C)) 
((AND  (EO  (TYPE  FN)  'TRIVIAL) 

(EO  (TYPE  (NODEXFORM  ( TRIVIALXNODE  FN)))  ‘VARIABLE) 

( TRIVFN  ( VARIABLEXVAR  (NODEXFORM  (TRIVIALXNODE  FN))))) 

( PROOUCE-TRIVFN-COMBINAT ION  BODY  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C)) 
(T  (PRODUCE -COMB  I NAT  ION  BODY  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C))))) 

(RETURN 

(LET  ((FN  (CNOOEXCFORM  ( RETURNXCONT  CFM)))) 

(IT  (EO  (TYPE  FN)  ‘CONTINUATION) 

(PRODUCE-CONTINUATION-RETURN  BOOY  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 
(PRODUCE-RETURN  BOOY  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C)))))))) 

(DEFINE  PRODUCE- IF 

(LAMBDA  (CNODE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 

(LET  ((CFM  (CNOOEXCFORM  CNODE))) 

(ANALYZE  (CIFXPREO  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (PREO  FNS) 

(COMP-BOOY  (CIFXCON  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

CENV 

FNS 

(LAMBDA  (CON  FNS) 

(COMP-BOOY  (CIFXALT  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

CENV 

FNS 

(LAMBDA  (ALT  FNS) 

(C  (CONDICATE  PREO 
CON 
ALT) 

FNS)))))))))) 
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PROUUCE-ASET  first  calls  ANALYZE  on  the  body,  which  must  produce  a value 
(to  be  assigned  to  the  CASET  variable).  There  are  then  two  cases,  depending  on 
whether  the  CASET\CONT  is  a CONTINUATION  or  not. 

If  it  is,  then  the  body  of  the  continuation  is  compiled  (using  COMP- 
BODY),  and  then  LAMBDACATE  is  called  to  generate  the  invocation  of  the 
continuation.  The  routine  OUTPUT-ASET  generates  the  actual  MacLISP  SETQ  (or 
other  construct)  for  the  CASET  variable,  using  the  environment  location  provided 
by  LOOKUPICATE.  All  in  all  this  case  is  very  much  like  a RETURN  with  an  explicit 
CONTINUATION,  except  that  just  before  the  continuation  is  invoked  a SETQ  is  stuck 
in . 

If  the  CASET\CONT  is  not  a CONTINUATION,  then  ANALYZE  is  called  on  the 
CASET\CONT,  and  then  a piece  of  code  is  output  which  sets  **FUN**  to  the 
continuation,  **ONE**  (which  is  in  the  car  of  **ARGUMENT-REGISTERS**)  to  the 
value  of  the  body  (after  also  setting  the  CASET  variable,  using  OUTPUT-ASET),  and 
does  (RETURN  NIL),  which  is  the  SCHEME  run-time  protocol  for  invoking  a 
continuation. 


001 
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002 

(DEFINE 

003 

PRODUCE -ASET 

004 

(LAMBOA  (CNOOE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 

005 

(LET  ( (CFM  (CNODEXCFORM  CNODE ) ) ) 

006 

(ANALYZE  (CASETXBOOY  CFM) 

00  7 

RNL 

OOS 

PROGNAME 

009 

BLOCKFNS 

010 

FNS 

Oil 

(LAMBOA  (BOOY  FNS) 

012 

(LET  ( (CONTCFM  (CNODE \CFORM  (CASETXCONT  CFM)))) 

1 013 

(IF  (EQ  (TYPE  CONTCFM)  'CONTINUATION) 

014 

(COMP-BODY  (CONTINUATION\BOOY  CONTCFM) 

015 

(IF  (CONTINUATIONXTVARS  CONTCFM) 

016 

(CONS  (CONS  (CAR  (CONTINUATION\TVARS  CONTCFM)) 

017 

(TEMPLOC  (CONTINUATION\OEP 

016 

CONTCFM))) 

Sl  019 

(ENVCARCOR  CENV  RNL)) 

020 

( ENVCARCOR  CENV  RNL)) 

! 021 

PROGNAME 

022 

BLOCKFNS 

023 

CENV 

024 

FNS 

025 

(LAMBOA  (COPE  FNS) 

026 

(C  (LAMBDACATE 

027 

(LIST  ( CONT 1 NUAT IONNVAR  CONTCFM)) 

| 026 

( CON T I NU AT I ON \TVARS  CONTCFM) 

> 029 

(CONTINUATIONXDEP  CONTCFM) 

030 

(LIST  (OUTPUT-ASET 

031 

( LOOKUP ICATE  (CASETWAR  CFM) 

032 

RNL) 

033 

•OOY)) 

034 

(REMARK-ON  (CASETXCONT  CFM)) 

! 035 

•*»ENV** 

036 

COPE) 

037 

FNS))) 

( 036 

(ANALYZE 

039 

(CASETXCONT  CFM) 

; 040 

RNL 

[ 041 

PROGNAME 

042 

BLOCKFNS 

043 

FNS 

044 

(LAMBOA  (CONT  FNS) 

! 045 

(C  "(PROGN  (SETO  **FUN»*  .CONT) 

046 

(SETO  .(CAR  ••ARGUMENT-REGISTERS**) 

047 

.(OUTPUT-ASET 

046 

(LOOKUPICATE  (CASETWAR  CFM) 

049 

RNL) 

050 

•OOY)) 

051 

(RETURN  NIL)) 

1 052 

FNS)))))))))) 

1 . _ 
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PRODUCE-LABELS  takes  an  already-compiled  body  LBOD.  FNENV-FIX  is  a 
(possibly  empty)  list  of  pieces  of  code  which  will  fix  up  the  consed  environment 
by  adding  the  variables  common  to  all  the  closures  to  be  made  up  (this  set  was 
computed  by  CLOSE-ANALYZE  and  put  in  the  FNENV  slot  of  the  CLABELS).  The  code 
for  this  addition  is  built  from  the  list  of  variables  by  CONS-CLOSEREFS. 

There  are  then  three  cases,  depending  on  the  type  of  closures  to  be 
constructed  (NOCLOSE,  EZCLOSE,  or  NIL).  Suppose  that  the  CLABELS  is: 

(CLABELS  ((FOO  (LAMBDA  ...)) 

(BAR  (LAMBDA  ...))) 

<body>) 

Let  us  see  roughly  what  code  is  produced  for  each  case. 

For  a NIL  type  (full  closures),  the  idea  is  merely  to  create  all  the 
closures  in  standard  form  (but  with  a null  environment),  add  them  all  to  the 
conscd  environment,  and  then  go  back  and  clobber  the  environment  portion  of  the 
closures  with  the  new  resulting  environment,  plus  any  other  variables  needed. 
Now  a standard  closure  looks  like  (CBETA  <value  of  progname>  <tag> 

<environment> ) . (At  run  time  the  value  of  the  progname  will  be  a MacLISP  SUBR 
pointer  for  the  module;  the  tag  identifies  the  particular  routine  in  the 

module.)  In  the  DO  loop,  FNS  accumulates  the  function  definitions  (to  be 

compiled  separately  later),  RP  accumulates  RPLACD  forms  for  clobbering  the 
closures,  and  CB  accumulates  constructors  of  CBETA  lists.  For  our  example,  the 
generated  code  looks  like: 

((LAMBDA  (FOO  BAR) 

(SETQ  **ENV**  (CONS  ...  (CONS  X43  **ENV***) . . . ) ) 

(RPLACD  (CDDR  BAR)  (CONS  ...  (CONS  X72  **ENV**) . . . ) ) 

(RPLACD  (CDDR  FOO)  (CONS  ...  (CONS  X69  **ENV**) . . . ) ) 

<body>) 

(LIST  'CBETA  ?-453  'FOO-TAG) 

(LIST  'CBETA  ?-453  ’BAR-TAG)) 

where  ?-453  is  the  PROGNAME  for  the  module  containing  the  CLABELS,  and  FOO-TAG 
and  BAR-TAG  are  the  tags  (whose  names  will  actually  look  like  FNVAR-91)  for  FOO 
and  BAR.  (Now  in  fact  CLOSE-ANALYZE  creates  a null  FNENV  for  type  NIL  CLABELS, 
and  so  the  first  SETQ  would  in  fact  not  appear.  However,  the  decision  as  to  the 
form  of  the  FNENV  is  only  a heuristic,  and  so  PRODUCE-LABELS  is  written  so  as  to 
be  prepared  for  any  possible  choice  of  FNENV  and  CLOSEREFS  of  individual  labelled 
functions.  In  this  way  the  heuristic  in  CLOSE-ANALYZE  can  be  freely  adjusted 
without  having  to  change  PRODUCE-LABELS.) 

For  the  EZCLOSE  case  the  "closures"  need  only  contain  environments,  not 
also  code  pointers.  A trick  is  needed  here,  however,  to  build  the  circular 
environment.  When  adding  the  labelled  functions  to  the  environment,  we  must 
somehow  cons  in  an  object;  but  we  want  this  object  to  possibly  be  the 
environment  itself!  What  we  do  instead  is  to  make  up  a list  of  the  tag,  and 
later  RPLACD  this  list  cell  with  the  environment.  The  tag  is  never  used,  but  is 
useful  for  debugging.  This  method  also  makes  the  code  very  similar  to  the  NIL 
case,  the  only  difference  being  that  the  atom  CBETA  and  the  value  of  the  PROGNAME 
are  not  conscd  onto  each  closure. 
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002 

(DEFINE 

003 

PROOUCE -LABELS 

004 

(LAMBDA  ( CNOOE 

LBOO  RNL  PROGNAME  BLOCKFNS  FNS  C) 

005 

(LET  ( (CFM  (CNOOE \CFORM  CNOOE))) 

006 

(LET  ((VARS  ( CLABELSSF NVARS  CFM)) 

007 

( OEFS  (CLABELS\FNDEFS  CFM)) 

006 

(FNENV  (CLABELSVFNENV  CFM))) 

009 

(LET  ( (FNENV-F IX  (IF  FNENV  *((SETQ  **ENV**  .(CONS-CLOSEREFS  FNENV  RNL)))))) 

010 

(EQCASE  (CLABELS\EASY  CFM) 

Oil 

(NIL 

012 

(DO  ((V  VARS  (COR  V)) 

013 

(0  OEFS  (COR  0)) 

014 

(FNS  FNS  (CONS  (LIST  PROGNAME  (CAR  0)  NIL)  FNS)) 

015 

(RP  NIL  (CONS  "(RPLACO  (CDDR  .(CAR  V)) 

016 

.(CONS-CLOSEREFS 

017 

(CLAMBOANCLOSEREFS 

016 

( CNOOE \CFORM  (CAR  0))) 

019 

RNL)) 

020 

RP)) 

021 

(CB  NIL  (CONS  ’(LIST  'CBETA  .PROGNAME  '.(CAR  V))  CB))) 

022 

((NULL  .V) 

023 

(C  ’((LAMBDA  .VARS 

024 

OFNENV-F IX 

025 

ORP 

026 

B( OEPROGN IF Y LBOO)) 

027 

KNREVERSE  CB)) 

026 

FNS)))) 

029 

(EZCLOSE 

030 

(DO  ((V  VARS  (COR  V)) 

031 

(0  OEFS  (COR  0)) 

032 

(FNS  FNS  (CONS  (LIST  PROGNAME  (CAR  0)  NIL)  FNS)) 

033 

(RP  NIL  (CONS  ’(RPLACO  .(CAR  V) 

034 

.(CONS-CLOSEREFS 

03S 

(CLAMBOANCLOSEREFS 

036 

( CNOOE \CFORM  (CAR  0))) 

037 

RNL)) 

036 

RP)) 

039 

(CB  NIL  (CONS  ’(LIST  '.(CAR  V))  CB))) 

040 

((NULL  V) 

041 

(C  ’((LAMBDA  .VARS 

042 

BFNENV-FIX 

043 

•RP 

044 

f( OEPROGN IFY  LBOO)) 

045 

|(NRE VERSE  CB)) 

046 

FNS)))) 

047 

(NOCLOSE 

046 

(C  ’(PROGN  BFNENV-FIX  0(DEPROGNIFY  LBOO)) 

049 

(00  ((V  VARS  (COR  V)) 

050 

(0  OEFS  (COR  0)) 

051 

(FNS  FNS  (CONS  (LIST  PROGNAME  (CAR  D)  RNL)  FNS))) 

052 

1 

. 

((NULL  V)  FNS)))))))))) 

l . 
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One  problem  is  that  these  "closures"  are  not  of  the  same  form  as  ordinary 
EZCLOSE  closures,  which  do  not  have  the  tag.  This  is  the  purpose  of  the  LABELS- 
FUNCTION  properties  which  BIND-ANALYZE  created;  when  a call  to  an  EZCLOSE 
function  is  generated,  the  presence  of  a LABELS-FUNCTION  property  indicates  that 
the  "closure"  itself  is  not  the  environment,  but  rather  its  cdr  is.  (It  would  be 
possible  to  do  without  the  cell  containing  the  tag,  by  instead  making  up  the 
environment  with  values  of  NIL,  then  constructing  the  "closures"  as  simple 
environments,  and  then  going  back  and  clobbering  the  environment  structure  with 
the  closure  objects,  rather  than  clobbering  the  closure  objects  themselves.  The 
decision  not  to  do  this  was  rather  arbitrary.)  The  generated  code  for  the 
EZCLOSE  case  thus  looks  like: 

((LAMBDA  (FOO  BAR) 

(SETQ  **ENV**  (CONS  ...  (CONS  X43  **ENV***) . . . ) ) 

( RPLACD  (CDDR  BAR)  (CONS  ...  (CONS  X72  **ENV**) . . . ) ) 

(RPLACD  (CDDR  FOO)  (CONS  ...  (CONS  X69  **ENV**) . . . ) ) 

<body>) 

(LIST  'FOO-TAG) 

(LIST  'BAR-TAG)) 

In  the  NOCLOSE  case,  no  closures  are  made  at  run  time  for  the  labelled 
functions,  and  so  the  code  consists  merely  of  the  FNENV-FIX  (which,  again,  using 
the  current  heuristic  in  CLOSE-ANALYZE  will  always  be  null  in  the  NOCLOSE  case) 
and  the  code  for  the  body: 

( PROGN  (SETQ  **ENV**  (CONS  ...  (CONS  X43  **ENV**) . . . ) ) <body>) 

In  any  case,  of  course,  the  labelled  functions  are  added  to  the  FNS  list  which  is 
handed  back  to  C for  later  compilation. 


PRODUCE-LAMBDA-COMBINATION  generates  code  for  the  case  of  ((CLAMBDA  ...) 
argl  ...  argn).  First  a number  of  consistency  checks  are  performed,  to  make 
sure  the  pass-2  analysis  is  not  completely  awry.  Then  code  is  generated  for  the 
body  of  the  CLAMBDA,  using  COMP-BODY.  Then  all  the  arguments,  which  are  of 
course  expected  to  produce  values,  are  given  to  MAPANALYZE,  which  will  call 
ANALYZE  on  each  in  turn  and  return  a list  of  the  pieces  of  generated  code  (here 
called  ARGS  in  the  continuation  handed  to  MAPANALYZE).  Finally,  LAMBDACATE  is 
called  to  generate  the  code  for  entering  the  body  after  setting  up  the  arguments 
in  an  appropriate  manner.  Notice  the  use  of  SET-UP-ASETVARS  to  generate  any 
necessary  additional  code  for  adding  ASETVARS  to  the  consed  environment  on 
entering  the  body.  (A  more  complicated  compiler  would  in  this  situation  add  the 
argument  values  to  the  consed  environment  directly,  rather  than  first  putting 
them  in  registers  (which  is  done  by  LAMBDACATE)  and  then  moving  the  registers 
into  the  consed  environment  (which  is  done  by  SET-UP-ASETVARS).  To  do  this, 
however,  would  involve  destroying  the  modular  distinction  between  LAMBDACATE  and 
SET-UP-ASETVARS.  The  extra  complications  were  deemed  not  worthwhile  because  in 
practice  the  ASETVARS  set  is  almost  always  empty  anyway.) 
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002  (DEFINE 

003  PRODUCE-LAMBDA-COMBINATION 

004  (LAMBDA  (CNOOE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 

005  (LET  ( ( CFM  ( C NODE \C FORM  CNOOE))) 

006  (LET  ((FN  ( CNOOE \CFORM  (CAR  ( CCOMB I NAT ION\ARGS  CFM))))) 

007  (AND  (CLAMBOA\CLOSEREFS  FN) 

008  (ERROR  ' (Functional  LAMBDA  hat  CL0SEREF5|  CNOOE  'FAIL-ACT)) 

009  (OR  (EQUAL  CENV  ( CL AMBDA\CONSENV  FN)) 

010  (ERROR  ' (Environment  disagreement!  CNOOE  'FAIL-ACT)) 

011  (OR  (EQ  (CLAMBDA\FNP  FN)  'NOCLOSE) 

012  (ERROR  * INon-NOCLOSE  LAMBDA  In  Function  position!  CNODE  'FAIL-ACT)) 

013  (COMP -BODY 

014  (CLAMBDANBODY  FN) 

015  ( ENVCARCOR  (CLAMBOA\ASETVARS  FN) 

016  (REGSLIST  FN  T (ENVCARCOR  CENV  RNL))) 

0 1 7 PROGNAME 

018  BLOCKFNS 

019  (APPEND  ( CL AMBDANASE TVARS  FN)  CENV) 

020  FNS 

021  (LAMBDA  (BODY  FNS) 

022  (MAPANALYZE  (COR  ( CCOMB INATION\ARGS  CFM)) 

023  RNL 

024  PROGNAME 

025  BLOCKFNS 

026  FNS 

027  (LAMBDA  (ARGS  FNS) 

028  (C  (LAMBDACATE  (CLAMBDAWARS  FN) 

029  ( CL AMBDAXTVARS  FN) 

030  (CLAMBOA\DEP  FN) 

031  ARGS 

032  (REMARK-ON 

033  (CAR  (CCOMB INATION\ARGS  CFM))) 

034  ' **ENV** 

035  ( SE T - UP -ASE TVARS 

036  BOOY 

037  (CLAMBOA\ASE TVARS  FN) 

038  (REGSLIST  FN  NIL  NIL))) 

039  FNS))))))))) 
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PRODUCE-TRIVFN-COMBINATION  handles  a case  like  (CONS  continuation  argl 
argZ),  i.e.  a CCONBINATION  whose  function  position  contains  a TRIVFN.  First  all 
the  arguments  ( excluding  the  continuation!)  are  given  to  MAPANALYZE;  then  a 
dispatch  is  made  on  whether  the  continuation  is  a CONTINUATION  or  a CVARIABLE, 
and  one  of  two  specialists  is  called. 

PRODUCE- TRIVFN -COMBINATION-CONTINUATION  handles  a case  like  (CONS 
(CONTINUATION  (Z)  <body>)  argl  argZ).  The  idea  here  is  to  compile  It 
approximately  as  if  it  were 

((CONTINUATION  (Z)  <body>)  (CONS  argl  arg2)) 

That  is,  the  arguments  are  evaluated,  the  trivial  function  is  given  them  to 
produce  a value,  and  that  value  is  then  given  to  the  continuation.  Accordingly, 
the  body  of  the  CONTINUATION  is  compiled  using  COMP-BODY,  and  then  LAMBDACATE 
takes  care  of  setting  up  the  argument  (the  fourth  argument  to  LAMBDACATE  is  a 
list  of  the  MacLISP  code  for  invoking  the  trivial  function)  and  invoking  the  body 
of  the  (necessarily  NOCLOSE)  CONTINUATION. 
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017 
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022 

023 

024 

025 

026 

027 

028 

029 

030 

031 

032 

033 
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(DEFINE  PRODUCE  - TR I VFN-COMB INAT ION 

(LAMBDA  (CNOOE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 

(LET  ( (CFM  ( CNOOE \CFORM  CNODE ) ) ) 

(LET  ((FN  ( CNODE \CFORM  (CAR  ( CCOMB I NAT IONXARGS  CFM)))) 

(CONT  ( CNODE \CFORM  (CADR  (CC0MBINAT!0N\AR6S  CFM))))) 

(MAPANALYZE  (CDDR  ( CCOMB I NAT I0N\ARGS  CFM)) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBOA  (ARGS  FNS) 

(EQCASE  (TYPE  CONT) 

(CONTINUATION 

( PRODUCE  - TR  I VF  N-COMB  INAT  ION  -CONT  INUAT  ION 
CNODE  RNL  PROGNAME  BLOCKFNS  CENV 
FNS  C CFM  FN  CONT  ARGS)) 

(CVARIABLE 

( PRODUCE  - TR  I VF  N-COMB  I NAT  ION -CVAR I ABLE 
CNOOE  RNL  PROGNAME  BLOCKFNS  CENV 
FNS  C CFM  FN  CONT  ARGS))))))))) 

( OEF INE  PRODUCE- TR I VFN-C0M8 INAT ION -CONT INUAT  ION 

(LAMBDA  (CNODE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C CFM  FN  CONT  ARGS) 

(BLOCK  (AND  (CONT INUAT IONNCLOSEREFS  CONT) 

(ERROR  • (CONTINUATION  for  TR1VFN  has  CLOSEREFSf  CNODE  ’FAIL-ACT)) 

(OR  (EQ  ( CONT INUAT ION\FNP  CONT)  ’NOCLOSE) 

(ERROR  ’ |Non-NOCLOSE  CONTINUATION  for  TRIVFN|  CNODE  ’FAIL-ACT)) 
(COMP-BODY  (CONT INUAT ION\BODY  CONT) 

(IF  (CONT INUAT IONYTVARS  CONT) 

(CONS  (CONS  (CAR  (CONT INUAT ION\TVARS  CONT)) 

(TEMPLOC  (CONT INUAT ION\DEP  CONT))) 

(ENVCARCDR  CENV  RNL)) 

( ENVCARCDR  CENV  RNL)) 

PROGNAME 

BLOCKFNS 

CENV 

FNS 

(LAMBDA  (BOOY  FNS) 

(C  (LAMBDACATE 

(LIST  (CONT INUAT IONWAR  CONT)) 

( CONT I NUAT ION\TVARS  CONT) 

(CONTINUATIONXDEP  CONT) 

(LIST  ”(  .(VAR  I ABLE  WAR  (NODEVFORM  ( TRIV1ALNNOOE  FN))) 
(ARGS)) 

(REMARK-ON  (CADR  ( CCOMB INAT ION\AR6S  CFM))) 

' **ENV*» 

BODY) 

E»S)))))) 


I 


224 


PRODUCE-TRIVFN-COMBINATION-CVARIABLE  handles  a case  like  (CONS  CONT-43 
argl  arg2),  where  the  continuation  for  a trivial  function  call  is  a CVARIABLE. 
In  this  situation  the  continuation  is  given  to  ANALYZE  to  generate  MacLISP  code 
for  referring  to  it;  there  are  then  two  cases,  depending  on  whether  the 
CVARIABLE  has  a KNOWN-FUNCTION  property.  (Note  that  before  the  decision  is  made, 
VAL  names  the  piece  of  MacLISP  code  for  calling  the  trivial  function  on  the 
arguments . ) 

If  the  CVARIABLE  denotes  a KNOWN- FUNCTION,  then  it  should  be  possible  to 
invoke  it  by  adjusting  the  environment,  setting  up  the  arguments  in  registers, 
and  jumping  to  the  code.  First  the  environment  adjustment  is  computed;  ADJUST- 
KNOWNFN-CENV  generates  a piece  of  MacLISP  code  which  will  at  run  time  compute  the 
correct  new  environment  in  which  the  continuation  will  expect  to  run.  There  are 
then  two  subcases,  depending  on  whether  the  KNOWN-FUNCTION  is  of  type  NOCLOSE  or 
not.  If  it  is,  then  LAMBDACATE  is  used  to  set  up  the  arguments  in  the 
appropriate  registers  (the  last  argument  of  NIL  indicates  that  there  is  no 
"body",  but  rather  that  the  caller  of  LAMBDACATE  takes  the  responsibility  of 
jumping  to  the  code).  If  it  is  not,  then  PSETQIFY  is  used,  because  the  value 
will  always  go  in  **ONE**  (which  is  the  car  of  **ARGUMENT-REGISTERS**) . In 
either  case,  a GO  is  generated  to  jump  to  the  code  (within  the  current  module,  of 
course)  for  the  continuation. 

If  the  continuation  is  not  a KNOWN-FUNCTION,  then  the  standard  function 
linkage  mechanism  is  used:  the  continuation  is  put  into  **FUN**,  the  value  into 
**ONE**,  and  then  (RETURN  NIL)  exits  the  module  to  request  the  SCHEME  run-time 
interface  to  invoke  the  continuation  in  whatever  manner  is  appropriate. 
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(DEF INE  PRODUCE  - TR I VFN- COMB  t NAT ION-CVARI ABIE 

(LAMBDA  ( CNODE  RNL  PR06NAME  BIOCKFNS  CENV  FNS  C CFM  FN  CONT  ARGS) 

(ANALYZE 

(CADR  ( CCOMB INAT ION\ARGS  CFM)) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (CONTF  FNS) 

(LET  ((KF  (GET  (CVAR1ABLEWAR  CONT)  ' KNOWN-FUNCT ION ) ) 

(VAL  ’( .(VARIABLE  WAR  (NODEVFORM  ( TRIVIAL  \NODE  FN)))  BARGS))) 

(IF  KF 

(LET  ( (KCFM  (CNODE \CF0RM  KF))) 

(LET  ((ENVADJ 

( AD JUST -KNOVNFN -CENV  CENV 

(CVAR1  ABLE  WAR  CONT) 

CONTF 

(CONTINUAT IONNFNP  KCFM) 

(APPEND 

(CONTINUATIONNCLOSEREFS  KCFH) 
(CONTINUAT IONNCONSENV  KCFM))))) 

(C  *( PROGN 

f ( IF  (EQ  (CONT!NUATION\FNP  KCFM) 

'NOCLOSE) 

(DEPROGNIFY 

(LAMBDACATE  (LIST  (CONTINUAT  ION  WAR  KCFM)) 

( CONTINUAT 10N\TVARS  KCFM) 
(CONTINUATIONSDEP  KCFH) 

(LIST  VAL) 

(REHARK-ON  KF ) 

ENVADJ 

NIL)) 

(PSETOIFY  (LIST  ENVADJ  VAL) 

(LIST  •»*ENV** 

(CAR  **ARGWMENT-REG  IS  TENS**  ) ) }) 
(GO  .(CONTINUATIONWAME  KCFM))) 

ENS))) 

(C  ‘(PROGN  (SETO  »»FUN»*  .CONTF) 

(SETQ  .(CAR  **ARGUHENT-REGISTERS**)  .VAL) 

(RETURN  NIL)) 

FNS))))))) 
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PRODUCE-COMBINATION  handles  combinations  whose  function  positions  contain 
neither  TRIVFNs  nor  CLAMBDAs.  All  of  the  arguments,  including  the  function 
position  itself  and  the  continuation,  are  given  to  NAPANALYZE,  resulting  in  a 
list  FORM  of  pieces  of  MacLISP  code.  There  are  then  two  cases.  If  the  function 
position  is  a VARIABLE  (within  a TRIVIAL  - not  a CVARIABLE! ) , then  PRODUCE- 
COMHINATION-VARIABLE  is  used.  Otherwise  code  is  generated  to  use  the  standard 
SCHEME  run-time  interface:  first  set  **FUN**  to  the  function,  then  set  up  the 
arguments  in  the  standard  argument  registers  (PSETQ-ARGS  generates  the  code  for 
this),  then  set  **NARGS**  to  the  number  of  arguments  (this  does  not  include  the 
continuation),  and  exit  the  module  with  (RETURN  NIL). 


PRODUCE-COMBINATION-VARIABLE  first  determines  whether  the  variable  has  a 
KNOWN -FUNCTION  property.  If  so,  then  the  approach  is  very  much  as  in  TRIVFN- 
COMB I NAT  I ON -CVARIABLE : first  the  environment  adjustment  is  computed,  then  either 
LANBDACATE  or  PSETQ-ARGS-ENV  is  used  to  adjust  the  environment  and  set  up  the 
arguments,  and  finally  a GO  to  the  piece  of  code  for  the  KNOWN-FUNCTION  is 
generated . 

If  the  variable  is  not  a KNOWN-FUNCTION,  then  it  may  still  be  in  the  list 
BLOCKFNS  (which,  recall,  is  a list  of  user  functions  included  in  this  module). 
If  so,  the  effect  on  the  code  generation  strategy  is  roughly  as  if  it  were  a 
KNOWN-FUNCTION.  The  environment  adjustment  is  done  differently,  but  a GO  is 
generated  to  the  piece  of  code  for  the  called  function. 

In  any  other  case,  the  standard  interface  is  used.  **FUN**  is  set  to  the 
function,  the  arguments  are  set  up,  **NARGS**  is  set  to  the  number  of  arguments, 
and  (RETURN  NIL)  exits  the  module. 
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(DEFINE  PRODUCE-COMBINATION 

(LAMBDA  (CNODE  RNL  PROGNAME  BIOCKFNS  CENV  FNS  C) 

(MAP ANALYZE  ( CCOMB I NAT ION\ARGS  (CNOOEVCFORM  CNODE)) 

RNL 

PROGNAME 

8L0CKFNS 

FNS 

(LAMBDA  (FORM  FNS) 

(C  (LET  ((F  (CNOOEVCFORM  (CAR  (CCOMB INATION\ARGS 

(CNOOEVCFORM  CNODE)))))) 

(IF  (AND  (EQ  (TYPE  F)  'TRIVIAL) 

(EQ  (TYPE  (NODEVFORM  ( TR IVIAL VNODE  F))) 
'VARIABLE)) 

(LET  ((V  (VARIABLEWAR 

(NODEVFORM  ( TRIVIAL VNODE  F ) ) ) ) ) 
(PRODUCE-COMBINATION-VARIABLE 
CNOOE  RNL  PROGNAME  BLOCKFNS  CENV 
FNS  C FORM  V (6ET  V 'KNOWN-FUNCTION))) 
"(PROGN  (SETQ  **FUN*»  .(CAR  FORM)) 

•(PSETQ-ARGS  (CDR  FORM)) 

(SETQ  **NARGS»*  '.(LENGTH  (CDDR  FORM))) 
(RETURN  NIL)))) 

FNS))))) 


(DEF INE  PRODUCE-COMBINATION-VARIABLE 

(LAMBDA  (CNODE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C FORM  V KFN) 

(IF  KFN 

(LET  ( ( ENVAD J 

(ADJUST -KNOWNFN-CENV  CENV 
V 

(CAR  FORM) 

(CLAM60AVFNP  (CNODE VCFORM  KFN)) 

(APPEND  (CLAMBDAVCLOSEREFS  (CNOOEVCFORM  KFN)) 

(CLAMBOAVCONSENV  (CNOOEVCFORM  KFN)))))) 
(OR  (EQ  (TYPE  (CNOOEVCFORM  KFN))  'CLAMBDA) 

(ERROR  ' | Known  function  not  CLAMBDA | CNOOE  'FAIL-ACT)) 

"(PROGN  f( IF  (EQ  (CLAM8DAVFNP  (CNOOEVCFORM  KFN))  'NOCLOSE) 

(DEPROGNIFY 

(LAM60ACATE  (CLAMBDAVVARS  (CNOOEVCFORM  KFN))  , 

( CL AMBDAVTVARS  (CNOOEVCFORM  KFN)) 

(CLAMBDAVDEP  (CNOOEVCFORM  KFN)) 

(COR  FORM) 

(REMARK-ON  KFN) 

ENVAOJ 
• NIL)) 

(PSETQ-ARGS-ENV  (COR  FORM)  ENVAOJ)) 

(GO  .(CLAMBOAVNAME  (CNOOEVCFORM  KFN))))) 

( IF  ( ASSQ  V BLOCKFNS) 

"(PROGN  f( PSETQ-ARGS  (COR  FORM)) 

f ( IF  (NOT  (EQUAL  (CLAMBOAVCONSENV 
(CNOOEVCFORM 

(CAOR  (ASSQ  V BLOCKFNS)))) 

CENV)) 

"((SETQ  **ENV*»  (CODOR  .(CAR  FORM))))) 

(GO  .(CLAMBOAVNAME  (CNODE VCFORM  (CADR  (ASSQ  V BLOCKFNS)))))) 
"(PROGN  (SETQ  *»FUN**  .(CAR  FORM)) 
f (PSETQ-ARGS  (CDR  FORM)) 

(SETQ  **NARGS**  '.(LENGTH  (CODR  FORM))) 

(RETURN  NIL)))))) 
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ADJUST-KNOWNFN-CENV  computes  a piece  of  code  for  adjusting  the 
environment.  CENV  is  the  internal  representation  (as  a list  of  variable  names) 
of  the  environment  in  which  the  generated  code  will  be  used.  VAR  is  the  name  of 
the  variable  which  names  the  function  to  be  invoked,  and  for  whose  sake  the 
environment  is  to  be  adjusted.  VARREF  is  a piece  of  NacLISP  code  by  which  the 
run-time  value  of  VAR  may  be  accessed.  FNP  is  the  FNP  type  of  the  KNOWN-FUNCTION 
denoted  by  VAR.  LCENV  is  the  representation  of  the  environment  for  the  function. 
Thus,  the  generated  code  should  compute  LCENV  given  CENV. 

The  two  easy  cases  are  when  LCENV=CENV,  in  which  case  the  environment 
does  not  change,  and  when  LCENV=NIL,  in  which  case  the  run-time  environment  will 
also  be  NIL.  Otherwise  it  breaks  down  into  three  cases  on  FNP. 

For  FNP=NOCLOSE,  it  must  be  true  that  LCENV  is  some  tail  of  CENV;  that 
is,  there  is  a stack-like  discipline  for  NOCLOSE  functions,  and  so  CENV  was 
constructed  by  adding  things  to  LCENV.  The  piece  of  code  must  therefore  consist 
of  some  number  of  CDR  operations  on  **ENV**.  If  this  operation  does  not  in  fact 
produce  LCENV,  then  there  is  an  inconsistency  in  the  compiler. 

For  FNP=EZCLOSE,  then  VARREF  can  be  used  to  reference  the  run-time 
"closure";  this  may  require  a CDR  operation  if  the  function  is  an  EZCLOSE 
LABELS- FUNCTION  (see  PRODUCE-LABELS). 

For  FNP=NIL,  then  VARREF  will  refer  to  a full  closure;  the  CDDDR  of  this 
closure  is  the  environment. 


PRODUCE-CONTINUATION-RETURN  is,  mutatis  mutandis,  identical  to  PRODUCE- 
LAMBDA-  COMB  I NAT  ION.  This  is  a good  example  of  the  fact  that  much  code  was 
duplicated  because  of  the  early  design  decision  to  treat  COMBINATION  and  RETURN 
as  distinct  data  types. 
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(OEFINE  AOJUST-KNOWNFN-CCNV 

(LAMBOA  (CF:iV  VAR  VARREF  FNP  LCENV) 

(COHO  ((EQUAL  LCENV  CENV)  '**ENV*») 

((NULL  LCENV)  'NIL) 

(T  (EOCASE  FNP 

(NOCLOSE 

(00  ((X  CENV  (COR  X)) 

(Y  ' **ENV*»  "(COR  ,Y)) 

(I  (-  (LENGTH  CENV)  (LENGTH  LCENV))  (-  I 1))) 

((<  I 1) 

( IF  (EQUAL  X LCENV) 

( OECARCDRATE  Y) 

(ERROR  ' ICannot  recover  environment  for  known  function! 
VAR 

•FAIL-ACT))))) 

(EZCLOSE 

(IF  (GET  VAR  'LABELS -FUNCTION) 

*(COR  .VARREF) 

VARREF  ) ) 

(NIL  "(CODOR  .VARREF))))))) 

( OEF INE  PRODUCE -CONTINUATION -RETURN 

(LAMBOA  ( CNOOE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 

(LET  ( (CFM  ( CNOOE \CFORM  CNOOE))) 

(LET  ((FN  (CNODENCFORH  ( RETURN\CONT  CFM)))) 

(ANO  (CONT 1NUAT tON\CLOSEREF S FN) 

(ERROR  1 (Functional  CONTINUATION  has  CLOSEREFSI  CNOOE  ’FAIL-ACT)) 
(OR  (EQUAL  CENV  (CONTINUATION\CONSENV  FN)) 

(ERROR  1 (Environment  disagreement!  CNOOE  ’FAIL-ACT)) 

(OR  (EQ  (CONTINUATIONVFNP  FN)  ’NOCLOSE) 

(ERROR  ' | Non- NOCLOSE  CONTINUATION  In  function  position! 

CNOOE 

•FAIL-ACT)) 

(COMP -BODY  ( CONT INUAT IONNBOOY  FN) 

(IF  ( CONT INUAT ION\TVARS  FN) 

(CONS  (CONS  (CAR  (CONTINUATION\TVARS  FN)) 

( i EMPLOC  (CONT INUAT ION\DEP  FN))) 

(ENVCARCDR  CENV  RNL)) 

(ENVCARCOR  CENV  RNL)) 

PROGNAME 

BLOCKFNS 

CENV 

FNS 

(LAMBDA  (BOOY  FNS) 

(ANALYZE  (RETURNWAL  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (VAL  FNS) 

(C  (LAMBDACATE 

(LIST  (CONT INUAT IONWAR  FN)) 

( CONT I NUAT ION\TVARS  FN) 

(CONT INUAT ION\OEP  FN) 

(LIST  VAL) 

(REMARK-ON  (RETURNNCONT  CFM)) 
•**ENV«* 

BOOY) 

F«))))))))> 
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PRODUCE-RETURN  and  PRODUCE-RETURN- 1 together  are  almost  identical  to 
PRODUCE-COMBINATION  and  PRODUCE-COMBINATION-VARIABLE,  except  that  the  division 
between  the  two  parts  is  different,  and  the  BLOCKFNS  trick  is  not  applicable  to 
RETURN. 


PRODUCE-RETURN  merely  calls  ANALYZE  on  each  of  the  continuation  and  the 
value,  and  calls  PRODUCE-RETURN-1. 

PRODUCE-RETURN- 1 checks  to  see  whether  the  continuation  is  a KNOWN- 
FUNCTION.  If  so,  the  environment  adjustment  is  computed,  and  code  is  generated 
in  a way  similar  to  previous  routines.  If  not,  the  standard  interface  (involving 
(RETURN  NIL))  is  used.  Notice  the  check  to  see  if  VAL  is  in  fact  **ONE**  (the 
car  of  **ARGUMENT-REGISTERS**) ; if  so,  the  redundant  code  (SETQ  **ONE**  **ONE**) 
is  suppressed. 
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(OEFINE  PRODUCE -RE TURN 

(LAMBDA  ( CNODE  RNL  PROGNAME  BLOCKFNS  CENV  FNS  C) 

(LET  ( (CFM  ( CNODE \CFORM  CNODE))) 

(ANALYZE  (RETURNWAL  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBOA  (VAL  FNS) 

(ANALYZE  (RETURNNCONT  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (CONT  FNS) 

(PRODUCE-RETURN-1 
CNOOE  RNL  PROGNAME  BLOCKFNS 
CENV  FNS  C CFM  VAL  CONT)))))))) 


(OEFINE  PROOUCE - RE  TURN  - 1 

(lambda  (cnode  rnl  progname  blockfns  cenv  fns  c cfm  val  cont) 

(IF  (AND  (EQ  (TYPE  (CNOOENCFORM  (RETURNNCONT  CFM)))  'CVARIABLE ) 

(GET  (CVARIABLE WAR  (CNOOENCFORM  (RETURNNCONT  CFM))) 

’KNOWN-FUNCTION)) 

(LET  ( ( KCFM  (CNODE NCFORM 

(GET  (CVARIABLENVAR 

(CNODE NCFORM  (RETURNNCONT  CFM))) 

’KNOWN-FUNCTION)))) 

(OR  (EQ  (TYPE  KCFM)  ’CONTINUATION) 

(ERROR  ' (Known  function  not  CONTINUATION)  CNOOE  'FAIL-ACT)) 

(LET  ((ENVADJ 

( ADJUST-KNOWNr N-CENV  CENV 

(CVARIABLENVAR  (CNOOE NCFORM  (RETURNNCONT  CFM))) 
CONT 

( CONT I NUAT IONNF NP  KCFM) 

(APPENO 

(CONTINUATIONNCLOSEREFS  KCFM) 
(CONTINUATIONNCONSENV  KCFM))))) 

(C  * ( PR06N  t( IF  (EQ  (CONT I NUAT IONNF NP  KCFM)  ’NOCLOSE) 

(OEPROGNIFY 

(LAMBOACATE  (LIST  ( CONT I NUAT IONNVAR  KCFM)) 

( CONT I NUAT IONNTVARS  KCFM) 

(CONT1NUATIONNOEP  KCFM) 

(LIST  VAL) 

(REMARK-ON 
(GET  (CVARIABLENVAR 

(CNOOENCFORM  (RETURNNCONT  CFM))) 
'KNOWN-FUNCTION)) 

ENVADJ 

NIL)) 

(PSETQIFY  (LIST  ENVAOJ  VAL) 

(LIST  '**ENV** 

(CAR  **AR6UMENT-RE6ISTERS** ) ) ) ) 

(GO  .(CONTINUATIONNNAME  KCFM))) 

FNS))) 

(C  " ( PR06N  (SETQ  **FUN**  .CONT) 

•(IF  (NOT  (EQ  VAL  (CAR  ••ARGUMENT-REGISTERS**))) 

‘((SETQ  .(CAR  ••ARGUMENT-REGISTERS** ) .VAL))) 

(RETURN  NIL)) 


FNS)))) 
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LANBDACATE  generates  code  for  invoking  a NOCLOSE  KNOWN-FUNCTION.  It 
arranges  for  the  arguments  to  be  evaluated  and  put  in  the  proper  registers,  and 
also  performs  some  optimizations. 

VARS  is  a list  of  the  variables  which  are  to  be  bound.  TVARS  is  a list 
of  those  variables  (a  subset  of  VARS)  which  will  actually  be  passed  through 
registers,  as  specified  by  the  TVARS  slot  of  the  CLAMBDA  or  CONTINUATION;  this 
is  used  for  a consistency  check  on  the  optimizations  of  LAMBDACATE.  DEP  is  the 
reqister  depth  of  the  function  (the  DEP  slot).  ARGS  is  a list  of  pieces  of 
MacLISP  code  which  have  been  generated  for  the  arguments  to  the  function.  REN  is 
a comment  (usually  one  generated  by  REMARK-ON)  to  be  included  in  the  generated 
code  for  debugging  purposes;  this  comment  typically  details  the  state  of  the 
environment  and  what  variables  are  being  passed  through  registers  at  this  point. 
ENVADJ  is  a piece  of  MacLISP  code  (usually  generated  by  ADJUST-KNOWNFN-CENV)  to 
whose  value  **ENV**  is  to  be  set,  to  adjust  the  environment.  BODY  may  be  a list 
of  pieces  of  MacLISP  code  which  constitute  the  body  of  the  known  function,  to  be 
executed  after  the  arguments  are  set  up  (typically  because  of  a combination  like 
((LAMBDA  ...)  ...)),  or  it  may  be  NIL,  implying  that  the  caller  of  LAMBDACATE 
intends  to  generate  a GO  to  the  code. 

LAMBDACATE  divides  ARGS  into  three  classes:  (1)  arguments  which  are 
themselves  NOCLOSE  KNOWN-FUNCTIONs  --  such  arguments  actually  have  no  actual  run- 
time representation  as  a MacLISP  data  object,  and  so  are  not  passed  at  all;  (2) 
arguments  whose  corresponding  variables  are  never  referenced  --  these  are 
accumulated  in  EFFARGS,  a list  of  arguments  to  be  evaluated  for  effect  only 
(presumably  the  optimizer  eliminated  those  unreferenced  arguments  which  had  no 
side  effects);  and  (3)  arguments  whose  values  are  needed  and  are  to  be  passed 
through  the  registers  --  these  are  accumulated  in  REALARGS,  and  the  corresponding 
variables  in  REALVARS. 

When  this  loop  is  done,  (the  reverse  of)  REALVARS  should  equal  TVARS,  for 
it  is  the  set  of  actually  passed  arguments. 

The  generated  code  first  evaluates  all  the  EFFARGS  (if  any),  then  sets 
all  the  proper  registers  to  the  REALARGS  (this  code  is  generated  by  PSETQ-TEMPS) , 
then  (after  the  remark  REM)  executed  the  BODY  (which,  if  NIL,  is  empty). 

For  example,  consider  generating  code  for: 

((LAMBDA  (FAB)  ...  (FA)  ...) 

(LAMBDA  (X)  ...) 

(CONS  X Y) 

(PRINT  Z)) 

where  F denotes  a NOCLOSE  KNOWN-FUNCTION,  and  B is  never  referred  to.  Then  the 
call  to  LAMBDACATE  might  look  like  this: 
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; HANDLE  CASE  OF  INVOKING  A KNOWN  NOCLOSE  FUNCTION  OR  CONTINUATION. 

; FOR  AN  EXPLICIT  ((LAMBDA  ...  BODY)  ...),  BODY  IS  THE  BOOY. 

; OTHERWISE.  IT  IS  NIL.  AND  SOMEONE  WILL  DO  AN  APPROPRIATE  GO  LATER. 


(DEFINE  LAMBDACATE 

(LAMBDA  (VARS  TVARS  DEP  ARGS  REM  ENVAOJ  BODY) 

(LABELS  ((LOOP 

(LAMBDA  (V  A REALVARS  REALARGS  EFFARGS) 

; ;REALVARS  IS  COMPUTEO  PURELY  FOR  ERROR-CHECKING 
(IF  (NULL  A) 

(LET  ((B  "(PROGN  f( PSETQ- TEMPS  (NREVERSE  REALARGS)  DEP  ENVADJ ) 
.REM 

0( DEPROGNIFY  BODY))) 

(RV  (NREVERSE  REALVARS))) 

(IF  (NOT  (EQUAL  RV  TVARS)) 

(ERROR  '|TVARS  screwup  In  LAMBDACATE  | 

•((VARS  » .VARS) 

(TVARS  » .TVARS) 

(REALVARS  • .RV)) 

•FAIL-ACT)) 

(IF  EFFARGS 

•(PROGN  (EFFARGS  ((DEPROGNIFY  6)) 

• >> 

(CONO  ((LET  ((KFN  (GET  (CAR  V)  'KNOWN-FUNCTION))) 

(AND  KFN 

(EQ  (EQCASE  (TYPE  (CNOOEXCFORM  KFN)) 

(CLAMBDA 
(CLAMBDAXFNP 
(CNOOENCFORM  KFN))) 

(CONTINUATION 
( CONT I NUAT ION\F NP 
(CNOOENCFORM  KFN)))) 

' NOCLOSE))) 

(LOOP  (COR  V)  (COR  A)  REALVARS  REALARGS  EFFARGS)) 

((OR  (6ET  (CAR  V)  'READ-REF S) 

(GET  (CAR  V)  'WRITE-REFS)) 

(LOOP  (COR  V) 

(COR  A) 

(CONS  (CAR  V)  REALVARS) 

(CONS  (CAR  A)  REALARGS) 

EFFARGS)) 

(T  (LOOP  (COR  V) 

(COR  A) 

REALVARS 

REALARGS 

(CONS  (CAR  A)  EFFARGS)))))))) 

(LOOP  VARS  AR6S  NIL  NIL  NIL)))) 
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(LAMBDACATE  ' (F  A B) 

'(A) 

12 

1 ( <illegal>  (CONS  X43  Y69)  (PRINT  Z91)) 

<remark> 

**ENV** 

<body>) 

where  <illegal>  is  an  object  that  should  never  be  looked  at  (see  ANALYZE- 
CLAMBDA);  X43,  Y69,  and  Z91  are  pieces  of  code  which  refer  to  the  variables  X, 
Y,  and  Z;  <remark>  is  some  remark;  the  environment  adjustment  is  assumed  to  be 
trivial;  and  <body>  is  the  code  for  the  body  of  the  LAMBDA.  The  generated  code 
would  look  something  like  this: 

(PROGN  (PRINT  Z91) 

(SETQ  -12-  (CONS  X43  Y69)) 

<remark> 

<body> ) 

Notice  that  LAMBDACATE  explicitly  takes  advantage  of  the  fact  that  the  execution 
of  arguments  for  a combination  may  be  arbitrarily  reordered. 


The  various  PSETQ. . . routines  generate  code  to  perform  Parallel  SETQs, 
i.e.  the  simultaneous  assignment  of  several  values  to  several  values.  The 
parallel  nature  is  important,  because  some  of  the  values  may  refer  to  other 
registers  being  assigned  to,  and  a sequential  series  of  assignments  might  not 
work . 

The  main  routine  here  is  PSETQIFY,  which  takes  a list  of  arguments 
(pieces  of  MacLISP  code  which  will  generate  values  when  executed  at  run-time)  and 
a list  of  corresponding  registers.  One  of  two  different  methods  is  used 
depending  on  the  number  of  values  involved.  Method  2 produces  better  code  (this 
is  obvious  only  when  one  understands  the  properties  of  the  MacLISP  compiler  which 
will  compile  the  MacLISP  code  into  PDP-10  machine  language).  Unfortunately,  it 
happened  that  when  RABBIT  was  written  there  was  a bug  in  the  MacLISP  compiler 
such  that  it  often  found  itself  unable  to  compile  the  code  generated  by  Method  2. 
Moreover,  the  primary  maintainer  of  the  MacLISP  compiler  was  on  leave  for  a year. 
For  this  reason  Method  3 was  invented,  which  always  works,  but  is  considerably 
more  expensive  in  terms  of  the  PDP-10  code  produced.  (I  concerned  myself  with 
this  low  level  of  detail  only  for  this  routine,  because  the  code  it  produces  is 
central  to  the  whole  code  generator,  and  so  its  efficiency  is  of  the  greatest 
importance.)  In  order  to  achieve  the  best  code,  I determined  empirically  that 
Method  7.  never  failed  as  long  as  fewer  than  five  values  were  involved.  I might 
also  add  that  a Method  1 was  once  used,  which  happened  to  provoke  a different  bug 
in  the  MacLISP  compiler;  Method  2 was  invented  in  an  attempt  to  circumvent  that 
first  hug!  Now  that  the  maintainer  of  the  MacLISP  compiler  (Jon  L White)  has 
returned,  it  may  soon  be  possible  to  remove  Method  3 from  RABBIT;  but  I think 
this  story  serves  as  an  excellent  example  of  pragmatic  engineering  to  get  around 
immediate  obstacles  (also  known  as  a "kludge"). 
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;;;  GENERATE  PARALLEL  SETQ’ING  OF  REGISTERS  TO  ARGS. 

;;;  RETURNS  A LIST  OF  THINGS;  ONE  WRITES  l(PSETQIFY  ...)  WITHIN  *. 

(DEFINE  PSETQIF Y 

(LAMBDA  (ARGS  REGISTERS) 

(IF  (<  (LENGTH  ARGS)  5) 

( PSETQIFY-METH00-2  ARGS  REGISTERS) 

(PSETQIFY-METHOO-3  ARGS  REGISTERS)))) 

(OEFINE  PSETQIFY-METHOO-2 

(LAMBDA  (ARGS  REGISTERS) 

(LABELS  ( ( PSETQ1 

(LAMBDA  (A  REGS  QVARS  SETQS  USED) 

(IF  (NULL  A) 

(IF  (NULL  SETQS) 

NIL 

(IF  (NULL  (CDR  SETQS)) 

"( (SETQ  .(CADAR  SETQS)  .(CAR  USED))) 

;; IMPORTANT;  DO  NOT  NRE VERSE  THE  SETOS! 
; ; MAKES  MACLISP  COMPILER  WIN  BETTER. 
‘(((LAMBDA  ,( NRE VERSE  QVARS)  RSETQS) 

!( NREVERSE  USED))))) 

(IF  (EQ  (CAR  A)  (CAR  RE6S ) ) 

(PSETQ1  (CDR  A) 

(CDR  REGS) 

QVARS 


;AVOIO  USELESS  SETQ’S 


SETQS 

USEO) 

((LAMBDA  (QV) 

(PSETQ1  (COR  A) 

(CDR  REGS) 

(CONS  QV  QVARS) 

(CONS  ‘(SETQ  .(CAR  REGS)  .QV)  SETQS) 
(CONS  (CAR  A)  USED))) 

(GENTEMP  *Q))))))) 

(PSETQ1  ARGS  REGISTERS  NIL  NIL  NIL)))) 
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Method  2 essentially  uses  local  MacLISP  LAMBDA  variables  to  temporarily 
name  the  values  before  assignment  to  the  registers,  while  Method  3 uses  global 
variables.  (Method  2 produces  better  code  because  the  MacLISP  compiler  can 
allocate  the  local  variables  on  a stack,  one  by  one,  and  then  pop  them  off  in 
reverse  order  into  the  "registers".)  Both  methods  perform  two  peephole 
optimizations:  (1)  If  a value-register  pair  calls  for  setting  the  register  to 
its  own  contents,  that  SETQ  is  eliminated.  (2)  If  this  elimination  reduces  the 
number  of  SETQs  to  zero  or  one,  then  NIL  or  a single  SETQ  is  produced,  rather 
than  the  more  complicated  and  general  piece  of  code. 

As  examples,  (PSETQIFY  ’(-12-  -12-  (CDR  -13-))  ’(-11-  -12-  -13-))  would 

produce : 

((LAMBDA  (Q-43  Q-44) 

(SETQ  -13-  Q-44) 

(SETQ  -11-  Q-43)) 

-12- 

(CDR  -13-)) 

(note  that  (SETQ  -12-  -12-)  was  eliminated),  and 

(PSETQIFY  '(-23-  -21-  -24-  -25-  -22-)  '(-21-  -22-  -23-  -24-  -25-)) 
would  produce: 

(PROG  ()  (DECLARE  (SPECIAL  -21--TENP  -22— TEMP  -23— TEMP  -24— TEMP  -25— TEMP) 
(SETQ  -25--TEMP  -Z2-) 

(SETQ  -24--TEMP  -25-) 

(SETQ  -23--TEMP  -24-) 

(SETQ  -22--TEMP  -21-) 

(SETQ  -21--TEMP  -23-) 

(SETQ  -25-  -25— TEMP) 

(SETQ  -24-  -24  — TEMP) 

(SETQ  -23-  -23--TEMP) 

(SETQ  -22-  -Z2--TEMP) 

(SETQ  -21-  -21— TEMP)) 

The  only  reason  for  using  PROG  is  so  that  the  DECLARE  form  could  be  included  for 
the  benefit  of  the  MacLISP  compiler. 

The  examples  here  are  slightly  incorrect;  PSETQIFY  actually  produces  a 
list  of  MacLISP  forms,  so  that  when  no  SETQs  are  produced  the  resulting  NIL  is 
interpreted  as  no  code  at  all. 

In  principle  the  elimination  of  redundant  SETQs  should  be  performed 
before  choosing  which  method  to  use,  so  that  there  will  be  a maximal  chance  of 
using  the  more  efficient  Method  2.  I chose  not  to  only  so  that  the  two  methods 
would  remain  distinct  pieces  of  code  and  thus  easily  replaceable. 
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(DEFINE  PSETQIFY-METHOD-3 

(LAMBDA  (ARGS  REGISTERS) 

(LABELS  ( ( PSETQ1 

(LAMBDA  (A  REGS  QVARS  SETQS  USED) 

(IF  (NULL  A) 

(IF  (NULL  SETQS) 

NIL 

(IF  I NULL  (COR  SETQS)) 

' { (SETQ  .(CADAR  SETQS)  .(CADDR  (CAR  USED)))) 

*((P«OG  ()  (DECLARE  (SPECIAL  8QVARS ) ) BUSED  tSETQS)  ))) 
(IF  (EQ  (CAR  A)  (CAR  REGS))  ;AVOID  USELESS  SETQ'S 

(PSETQ1  (COR  A) 

(COR  REGS) 

QVARS 

SETQS 

USED) 

((LAMBOA  (QV) 

( PSETQ1  (COR  A) 

(COR  REGS) 

(CONS  QV  QVARS) 

(CONS  ’(SETQ  .(CAR  REGS)  ,QV)  SETQS) 

(CONS  "(SETQ  .QV  .(CAR  A))  USED))) 

(CATENATE  (CAR  RE6S)  • | -TEMPI )))))) ) 

(PSETQ1  ARGS  REGISTERS  NIL  NIL  NIL)))) 
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PSETQ-ARGS  is  a handy  routine  which  calls  PSETQ-ARGS-ENV  with  an  ENVADJ 
of  **ENV**,  knowing  that  later  the  redundant  "(SETQ  **ENV**  **ENV**)"  will  be 
eliminated. 

PSETQ-ARGS-ENV  takes  a set  of  arguments  and  an  environment  adjustment, 
and  arranges  to  call  PSETQIFV  so  as  to  set  up  the  standard  argument  registers. 
Recall  that  how  this  is  done  depends  on  whether  the  number  of  arguments  exceeds 
**NUNBER-OF-ARG-REGS**;  if  it  does,  then  a list  of  the  arguments  (except  the 
continuation)  is  passed  in  **ONE**.  **ENV+CONT+ARG-REGS**  is  the  same  as 
★★ARGUMENT-REGISTERS**,  except  that  both  the  names  **ENV**  and  **CONT**  are 
adjoined  to  the  front.  It  can  be  quite  critical  that  **ENV**  and  the  argument 
registers  be  assigned  to  in  parallel,  because  the  computation  of  the  argument 
values  may  well  refer  to  variables  in  the  environment,  whereas  the  environment 
adjustment  may  be  taken  from  a closure  residing  in  one  of  the  argument  registers. 

PSETQ-TEMPS  is  similar  to  PSETQ-ARGS-ENV,  but  is  used  on  registers  other 
than  the  standard  argument-passing  registers.  It  takes  ARGS  and  ENVADJ  as 
before,  but  also  a depth  DEP  which  is  the  number  of  the  first  register  to  be 
assigned  to.  TEMPLOC  is  used  to  generate  the  register  names,  then  **ENV**  is 
tacked  on  and  PSETQIFY  does  the  real  work. 


MAPANALYZE  is  a simple  loop  which  maps  over  a list  of  cnode-trees  and 
calls  ANALYZE  on  each.  A list  of  the  results  returned  by  ANALYZE  is  given  to  C. 
Also,  FNS  is  chained  through  the  calls  to  ANALYZE,  so  that  all  functions  to  be 
compiled  later  will  have  been  accumulated  properly. 
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(DEFINE  PSETQ-ARGS 

(LAMBDA  (ARGS) 

(PSETQ-ARGS-ENV  ARGS  '«*ENV«*))) 

(DEFINE  PSETQ-ARGS-ENV 

(LAMBDA  (ARGS  ENVADJ) 

(IF  (>  (LENGTH  ARGS)  (+  **NUMBER-OF -ARG-KEGS**  1)) 

(PSETOIFY  (LIST  ENVADJ  (CAR  ARGS)  (CONS  'LIST  (COR  ARGS))) 
* * E NV+CON  T+ ARG -REGS*  * ) 

(PSETQIFY  (CONS  ENVADJ  ARGS)  **ENV+CONT+ARG-REGS** ) ) ) ) 

(DEFINE  PSETQ- TEMPS 

(LAMBDA  (ARGS  DEP  ENVADJ) 

(DO  ((A  ARGS  (COR  A)) 

(J  DEP  (+  J 1)) 

(R  NIL  (CONS  (TEMPLOC  J)  R))) 

((NULL  A) 

(PSETQIFY  (CONS  ENVADJ  ARGS) 

(CONS  ' **ENV**  (NREVERSE  R) )))))) 


(OEFINE  MAP ANALYZE 

(LAMBDA  (FLIST  RNL  PROGNAME  BLOC KF NS  FNS  C) 

(LABELS  ((LOOP 

(LAMBDA  (F  Z FNS) 

(IF  (NULL  F) 

(C  (NREVERSE  Z)  FNS) 

(ANALYZE  (CAR  F) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (STUFF  FNS) 

(LOOP  (COR  F) 

(CONS  STUFF  Z) 
FNS))))))) 

(LOOP  FLIST  NIL  FNS)))) 
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ANALYZE  is  the  routine  called  to  compile  a piece  of  code  which  is 
expected  to  produce  a value.  ANALYZE  itself  is  primarily  a dispatch  to 
specialists.  For  the  "simple"  case  of  a "trivial"  form,  TRIVIALIZE  is  used  to 
generate  the  code.  For  the  simple  case  of  a CVARIABLE,  ANALYZE  simply  uses 
LOOKUPICATE  to  get  the  code  for  the  variable  reference. 


ANALYZE-CLAMBDA  has  three  cases  based  on  FNP.  For  type  NIL,  code  is 
generated  to  create  a full  closure  of  the  form  (CBETA  <value  of  progname>  <tag>  . 
<environment>) . CONS-CLOSEREFS  generates  the  code  to  add  the  CLOSEREFS  to  the 
existing  consed  environment  for  making  this  closure.  For  type  EZCLOSE,  just  the 
environment  part  is  created,  again  using  CONS-CLOSEREFS.  For  type  NOCLOSE,  the 
generated  "code"  should  never  be  referenced  at  all  --  it  is  not  even  passed  as  an 
argument  as  such  --  and  so  a little  message  to  the  debugger  is  returned  as  the 
"code",  which  of  course  must  not  appear  in  the  final  code  for  the  module.  For 
all  three  cases,  the  code  for  the  function  is  added  to  the  FNS  list  for  later 
compilation . 


ANALYZE -CONTINUATION  is  essentially  identical  to  ANALYZE-CLAMBDA. 


001 

( Dtf  I NE 
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002 

ANALYZE 

003 

(LAMBDA  (CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C) 

004 

(LET  ( (CFM  (CNOOE \CFORM  CNOOE))) 

005 

( EQCASE  (TYPE  CFM) 

006 

(TRIVIAL 

007 

(C  (TRIVIALIZE  (TRIVIALXNODE  CFM)  RNL)  FNS)) 

008 

(CVARIABLE 

009 

(C  (LOOKUPICATE  (CVARIABLEWAR  CFM)  RNL)  FNS)) 

010 

(CLAMBOA 

on 

(ANALYZE -CLAMBOA  CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM)) 

012 

(CONTINUATION 

013 

(ANALYZE -CONTINUATION  CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM)) 

014 

(CIF 

015 

(ANALYZE-CIF  CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM)) 

016 

(CLABELS 

017 

(ANALYZE -CLABELS  CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM)) 

018 

(CC0M6INATI0N 

019 

( ANAL YZE -CC0M8 INATION  CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM)) 

020 

(RETURN 

021 

(ANALYZE -RE TURN  CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM)))))) 

022 

023 

(DEF  I HE 

ANALYZE -CLAMBOA 

024 

('AMBOA  (CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM) 

025 

(EQCASE  (CLAMBOA\FNP  CFM) 

026 

(NIL  - 

027 

(C  "(CONS  'CBETA 

028 

(CONS  .PROGNAME 

029 

(CONS  1 ,(CLAMBOA\NAME  CFM) 

030 

.(CONS-CLOSEREFS  ( CLAMBOA\CLOSEREF S CFM) 

031 

KNL))))  | 

032 

(CONS  (LIST  PROGNAME  CNOOE  NIL)  FNS))) 

033 

(EZCLOSE 

034 

(C  (CONS-CLOSEREFS  (CLAMBDA\CLOSEREFS  CFM)  RNL) 

035 

(CONS  (LIST  PROGNAME  CNOOE  NIL)  FNS))) 

036 

(NOCLOSE 

037 

(C  1 IShQuldn* t ever  be  seen  - NOCLOSE  CLAMBOA | 

0 38 

(CONS  (LIST  PROGNAME  CNOOE  RNL)  FNS)))))) 

039 

040 

(DEFINE 

ANAL YZE -CONT INUAT ION 

041 

(LAMBDA  (CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM) 

042 

(EQCASE  ( CONT I NUAT ION\F NP  CFM) 

043 

(NIL 

044 

(C  "(CONS  ’CBETA 

045 

(CONS  .PR06NAME 

046 

(CONS  1 .(CONTINUATIONXNAME  CFM) 

047 

.(CONS-CLOSEREFS  (CONTINUATION\CLOSEREFS  CFM) 

048 

RNL)))) 

049 

(CONS  (LIST  PROGNAME  CNOOE  NIL)  FNS))) 

050 

(EZCLOSE 

051 

(C  (CONS-CLOSEREFS  (CONTINUATION\CLOSEREFS  CFM)  RNL) 

052 

(CONS  (LIST  PROGNAME  CNOOE  NIL)  FNS))) 

053 

(NOCLOSE 

054 

(C  ’ (Shouldn’t  ever  be  seen  - NOCLOSE  CONTINUATION! 

055 

(CONS  (LIST  PROGNAME  CNOOE  RNL)  FNS)))))) 

242 

ANALYZE-C1F  merely  calls  ANALYZE  recursively  on  the  predicate, 
consequent,  and  alternative,  and  then  uses  CONDICATE  to  construct  a MacLISP  COND 
form. 


ANALYZE-CLABELS  calls  ANALYZE  recursively  on  the  body  of  the  CLABELS,  and 
then  calls  PRODUCE-LABELS  to  do  the  rest.  (Unlike  the  other  PRODUCE-  functions, 
PRODUCE-LABELS  does  not  depend  on  generating  code  which  does  not  produce  a value. 
It  accepts  an  already-compiled  body,  and  builds  around  that  the  framework  for 
constructing  the  mutually  referent  functions.  If  the  body  was  compiled  using 
COMP-BODY,  then  the  code  generated  by  PRODUCE-LABELS  will  produce  no  value;  but 
if  the  body  was  compiled  using  ANALYZE,  then  it  will  produce  a value.) 
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(OEF INE  ANAIYZE-CIF 

(LAMBDA  ( CNOOE  RNL  PROGNAME  BLOCKFNS  F NS  C CFM) 

(ANALYZE  (CIFNPREO  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (PRED  FNS) 

(ANALYZE  (CIF\CON  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (CON  FNS) 

(ANALYZE  (CIFXALT  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (ALT  FNS) 

(C  (CONDICATE  PRED  CON  ALT) 
FNS))))))))) 

(DEFINE  ANALYZE-CLABELS 

(LAMBDA  ( CNOOE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM) 

(ANALYZE  (CLABELSXBODY  CFM) 

( ENVCARCOR  (APPEND  (CLABELSXFNENV  CFM) 

(CLABELSNCONSENV  CFM)) 

RNL) 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (LBOD  FNS) 

(PRODUCE-LABELS  CNOOE  LBOD  RNL  PROGNAME  BLOCKFNS  FNS  C))))) 
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ANALYZE-CCOMBINATION  requires  the  function  to  be  a CLAMBDA  (for  if  it 
were  not,  then  something  too  complicated  for  continuation-passing  style  is  going 
on).  ANALYZE  is  called  on  the  body  of  the  CLAMBDA,  and  then  on  all  the  arguments 
(using  MAPANALYZE);  finally  LAMBDACATE  is  used  to  generate  the  code. 
(LAMBDACATE  is  much  like  PRODUCE-LABELS,  in  that  it  is  handed  a body,  and  whether 
the  generated  code  produces  a value  depends  only  on  whether  the  body  does.) 


ANALYZE-RETURN  is  essentially  just  like  ANALYZE-CCOMBINATION. 
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(DEF INE 

ANAL YZE -CCOMB I NATION 

(LAMBDA  (CNODE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM) 

(LET  ((FN  ( CNODE  \CFORM  (CAR  (CCOMB INATIONXARGS  CFM))))) 

( IF  (EQ  (TYPE  FN)  ’CLAMBOA) 

(ANALYZE  (CLAMBDANBOOY  FN) 

( ENVCARCOR  (CLAMBDA\ASETVARS  FN) 

(REGSLIST  FN  T (ENVCARCOR  ( CL AMBOAXCONSENV  FN)  RNL))) 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (BODY  FNS) 

(MAPANALYZE 

(COR  ( CCOMB I N AT ION\ARGS  CFM)) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (ARGS  FNS) 

(C  (LAMBOACATE  (CLAMBDAWARS  FN) 

(CLAM80AXTVARS  FN) 

(CLAMBDAXDEP  FN) 

ARGS 

(REMARK-ON  (CAR  ( CCOMB INATIONXARGS  CFM))) 
'*»ENV*» 

( SET-UP- ASETVARS  BODY 

( CL AMBOAXASE TVARS  FM) 
(REGSLIST  FN  NIL  NIL))) 

FNS))))) 

(ERROR  1 INon-triviel  Function  in  ANALYZE -CCOMBINATIONI  CNODE  'FAIL-ACT))))) 

(DEFINE  ANALYZE-RETURN 

(LAMBDA  (CNODE  RNL  PROGNAME  BLOCKFNS  FNS  C CFM) 

(LET  ((FN  (CNOOEXCFORM  (RETURNXCONT  CFM)))) 

(IF  (EQ  (TYPE  FN)  'CONTINUATION) 

(ANALYZE  (CONTINUATIONXBODY  FN) 

(IF  ( CONT1NUAT ION\TVARS  FN) 

(CONS  (CONS  (CAR  (CONTINUATIONXTVARS  FN)) 

(TEMPLOC  (CONTINUAT IONXDEP  FN))) 

(ENVCARCOR  (CONTINUATIONXCONSENV  FN)  RNL)) 

(ENVCARCOR  (CONTINUATIONXCONSENV  FN)  RNL)) 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBOA  (BODY  FNS) 

(ANALYZE  (RETURNWAL  CFM) 

RNL 

PROGNAME 

BLOCKFNS 

FNS 

(LAMBDA  (ARG  FNS) 

(C  (LAMBDACATE 

(LIST  ( CONTINUAT IONXVAR  FN)) 
(CONTINUATIONXTVARS  FN) 

( CONTINUAT IONXDEP  FN) 

(LIST  ARG) 

(REMARK-ON  (RETURNXCONT  CFM)) 

•**ENV** 

BOOY) 

FNS))))) 

(ERROR  ' |Non-triviel  Function  In  ANALYZE -RE TURN | CNOOE  'FAIL-ACT))))) 
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LOOKUPICATE  (I  make  no  apology  for  the  choice  of  the  name  of  this  or  any 
other  function;  suffice  it  to  say  that  a function  named  LOOKUP  already  existed 
in  the  SCHEME  interpreter)  takes  a variable  name  VAR  and  a rename  list  RNL,  and 
returns  a piece  of  MacLISP  code  for  referring  to  that  variable.  If  an  entry  is 
in  RNL  for  the  variable,  that  entry  contains  the  desired  code.  Otherwise  a 
global  variable  reference  must  be  constructed.  This  will  simply  be  a reference 
to  the  MacLISP  variable,  unless  it  is  the  name  of  a TR1VFN.  In  this  case  a GETL 
form  is  constructed.  (This  is  a big  kludge  which  does  not  always  work,  and  is 
done  this  way  as  a result  of  a rather  unclean  hack  in  the  SCHEME  interpreter 
which  interfaces  MacLISP  functions  with  SCHEME  functions.) 


CONS-CLOSEREFS  constructs  a piece  of  MacLISP  code  which  will  cons  onto 
the  value  of  **ENV**  all  the  variables  in  the  set  CLOSEREFS.  This  is  a simple 
loop  which  uses  LOOKUPICATE  to  generate  code,  and  constructs  a chain  of  calls  to 
CONS.  For  example,  ( CONS-CLOSEREFS  '(A  B C)  NIL)  would  produce: 

(CONS  A (CONS  B (CONS  C **ENV**))) 

Notice  the  use  of  REVERSE  to  preserve  an  order  assumed  by  other  routines. 


OUTPUT-ASET  takes  two  pieces  of  code:  VARREF,  which  refers  to  a 
variable,  and  BODY,  which  produces  a value  to  be  assigned  to  the  variable.  From 
the  form  of  VARREF  a means  of  assigning  to  the  variable  is  deduced.  (This 
implies  that  OUTPUT-ASET  knows  about  all  forms  of  code  which  might  possibly  be 
returned  by  LOOKUPICATE  and,  a fortiori,  which  might  appear  in  a RNL.)  For 
example,  if  the  reference  is  (CADR  (CDDDDR  **ENV**)),  OUTPUT-ASET  would  generate 
( RPLACA  (CDR  (CDDDDR  **ENV**))  <body>). 
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(DEFINE  LOOKUP 1CATE 

(LAMBDA  (VAR  RNL ) 

((LAMBDA  (SLOT) 

(IF  SLOT  (COR  SLOT) 

(IF  (TRIVFN  VAR) 

"(GETL  '.VAR  ' (EXPR  SUBR  LSUBR) ) 

VAR))) 

(ASSQ  VAR  RNL)))) 

(DEFINE  CONS-CLOSEREFS 

(LAMBDA  (CLOSEREFS  RNL) 

(DO  ((CR  (REVERSE  CLOSEREFS)  (CDR  CR)) 

(X  •**ENV*»  "(CONS  .(LOOKUPICATE  (CAR  CR)  RNL)  .X))) 

((NULL  CR)  X)))) 

(DEFINE  OUTPUT-ASET 

(LAMBDA  (VARREF  BODY) 

(COND  ((ATOM  VARREF) 

"(SETQ  .VARREF  .BODY)) 

((EQ  (CAR  VARREF)  ’CAR) 

"(CAR  ( RPLACA  ,(CADR  VARREF)  .BODY))) 

((EQ  (CAR  VARREF)  ’CADR) 

"(CAR  (RPLACA  (CDR  .(CAOR  VARREF))  .BODY))) 

.((EQ  (CAR  VARREF)  'CADOR) 

"(CAR  (RPLACA  (CODR  .(CADR  VARREF))  .BODY))) 

((EQ  (CAR  VARREF)  ’CADDDR) 

"(CAR  (RPLACA  (CODDR  .(CAOR  VARREF))  .BODY))) 

(T  (ERROR  ’IUnknown  ASET  discipline  - OUTPUT-ASET|  VARREF  ’FAIL-ACT))))) 
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CONDI  CATE  takes  the  three  conponents  of  an  IF  conditional,  and  constructs 
a MacLISP  COND  form.  It  also  performs  a simple  peephole  optimization: 

(COND  (a  b) 

(T  (COND  (c  d)  ...))) 

becomes : 

(COND  (a  b)  (c  d)  ...) 

Also,  DEPROGNIFY  is  used  to  take  advantage  of  the  fact  that  MacLISP  COND  clauses 
are  implicitly  PROGN  forms.  Thus: 

(CONDICATE  '(NULL  X)  '(PROGN  (PRINT  X)  Y)  '(COND  ((NULL  Y)  X)  (T  FOO))) 
would  produce: 

(COND  ((NULL  X)  (PRINT  X)  Y)  ((NULL  Y)  X)  (T  FOO)) 

DECARCDRATE  is  a peephole  optimizer  which  attempts  to  collapse  CAR/CDR 
chains  in  a piece  of  MacLISP  code  to  make  it  more  readable.  For  example: 

(CAR  (CDR  (CDR  (CAR  (CDR  (CAR  (CDR  (CDR  (CDR  (CDR  X)))))))))) 

would  become: 

( CADDR  (CADR  (CADDDR  (CDR  X)))) 

The  arbitrary  heuristic  is  that  "A''  should  appear  only  initially  in  a "C...R" 
composition.  DECARCDRATE  also  knows  that  MacLISP  ordinarily  has  defined  CAR/CDR 
compositions  up  to  four  long. 
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;;;  CONDICAU  TURNS  AN  IF  INTO  A COND;  IN  SO  DOING  IT  TRIES  TO  MAKE  THE  RESULT  PRETTY. 

(DEF  INE  CONDICATE 

(LAMBDA  (PRED  CON  ALT) 

(IF  (OR  (ATOM  ALT)  (NOT  (EQ  (CAR  ALT)  'COND))) 

"(COND  ( .PREO  |(DEPROGNIFY  CON)) 

(T  KOEPROGNIFY  ALT))) 

"(COND  (.PREO  KOEPROGNIFY  CON)) 

KCOR  ALT))))) 

;;;  DECARCORATE  MAKES  CAR-COR  CHAINS  PRETTIER. 

(DEFINE  DECARCORATE 
(LAMBDA  (X) 

(COND  ((ATOM  X)  X) 

((EQ  (CAR  X)  ’CAR) 

(IF  (ATOM  (CAOR  X)) 

X 

(LET  ((Y  (OECARCORATE  (CADR  X)))) 

(CONO  ((EQ  (CAR  Y)  ’CAR)  "(CAAR  ,(CADR  Y))) 

((EQ  (CAR  Y)  ’COR)  “(CAOR  .(CADR  Y)>) 

((EQ  (CAR  Y)  ’CODR)  "(CADDR  .(CADR  Y))) 

((EQ  (CAR  Y)  ’CDDOR)  "(CADDDR  .(CADR  Y))) 

(T  "(CAR  .Y)))))) 

((EQ  (CAR  X)  ’CDR) 

(IF  (ATOM  (CADR  X)) 

X 

(LET  ((Y  (DECARCORATE  (CADR  X)))) 

(COND  ((EQ  (CAR  Y)  ’COR)  "(CDOR  .(CADR  Y))) 

((EQ  (CAR  Y)  ’CODR)  "(CDDDR  .(CADR  Y))) 

((EQ  (CAR  Y)  ’CDDDR)  "(CDDDDR  .(CADR  Y))) 

(T  "(COR  .Y)))))) 

(T  X)))) 
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TRIVIALIZE  is  the  version  of  ANALYZE  which  handles  trivial  forms.  Recall 
that  these  are  represented  as  pass-1  node-trees  rather  than  as  pass-2  cnode- 
trees.  The  task  of  TRIVIALIZE  is  to  take  such  a node-tree  and  generate  value- 
producing  code.  Recall  that  the  subforms  of  a trivial  form  must  themselves  be 
trivial . 


For  a CONSTANT,  a quoted  copy  of  the  value  of  the  constant  is  generated. 

For  a VARIABLE,  a reference  to  the  variable  is  generated  using 

LOOKUP  I CATE. 

For  an  IF,  the  components  are  recursively  given  to  TRIVIALIZE  and  then 
CONDICATF,  is  used  to  generate  a MacLISP  COND  form. 

For  an  ASET,  a reference  to  the  ASET  variable  is  generated  using 

LOOKUPICATE,  and  code  for  the  body  is  generated  by  calling  TRIVIALIZE 

recursively;  then  OUTPUT-ASET  generates  the  code  for  the  ASET. 

For  a COMBINATION,  the  function  must  be  either  a TRIVFN  or  a LAMBDA- 
expression.  For  the  former,  a simple  MacLISP  function  call  is  generated,  after 
generating  code  for  all  the  arguments.  For  the  latter,  TRIV-LAMBDACATE  is 
invoked  after  generating  code  for  the  arguments  and  the  LAMBDA  body. 


TRIV-LAMBDACATE  is,  so  to  speak,  a trivial  version  of  LAMBDACATE.  The 
arguments  are  divided  into  two  classes,  those  which  are  referenced  and  those 
which  are  not  (the  possibility  of  a referenced  argument  which  is  a KNOWN-FUNCTION 
cannot  arise).  When  this  is  done,  a MacLISP  ((LAMBDA  ...)  ...)  form  is 
generated,  preceded  by  any  unreferenced  arguments  (which  presumably  have  side- 
effects).  For  example: 

(TRIV-LAMBDACATE  '(VI  VZ  V3) 

•((CAR  X)  (PRINT  Y)  (CDR  Z)) 

' ( PROGN  (PRINT  VI)  (LIST  VI  V3))) 


ought  to  produce: 

(PROGN  (PRINT  Y) 

((LAMBDA  (VI  V3 ) 

(COMMENT  (VARS  = (AC))) 

(PRINT  VI) 

(LIST  VI  V3 ) ) 

(CAR  X) 

(CDR  Z))) 

Note  that  a MacLISP  LAMBDA  body  is  an  implicit  PROGN.  TRIV-LAMBDACATE  also  takes 
advantage  of  the  ability  to  arbitrarily  reorder  the  execution  of  arguments  to  a 
combination . 
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(DEFINE  TRIVIALIZE 

(LAMBDA  (NODE  RNL ) 

(LET  ( ( FM  (NODEXFORM  NODE))) 

(EQCASE  (TYPE  FM) 

(CONSTANT  •*  .(CONSTANTXVALUE  FH) ) 

(VARIABLE  (LOOKUPICATE  (VAR  I ABLE  WAR  FM)  RNL)) 

(IF  (CONDICATE  (TRIVIALIZE  ( IFXPRED  FM)  RNL) 

(TRIVIALIZE  (IFXCON  FM)  RNL) 

(TRIVIALIZE  ( IF\ALT  FH)  RNL))) 

(ASET 

(OUTPUT-ASET  (LOOKUPICATE  (ASETWAR  FM)  RNL) 

(TRIVIALIZE  (ASETXBOOY  FM)  RNL))) 

(COMBINATION 

(LET  ((ARGS  ( COMB  I NAT  I ONXARGS  FM))) 

(LET  ((FN  (NOOEXFORM  (CAR  ARCS)))) 

(IF  (ANO  (EQ  (TYPE  FN)  'VARIABLE) 

( VAR  I ABLE \GLOB ALP  FN) 

( TRIVFN  (VARIABLEWAR  FN))) 

(CONS  (VARIABLEWAR  FN) 

(AMAPCAR  (LAMBDA  (A)  (TRIVIALIZE  A RNL)) 

(CDR  ARGS))) 

(IF  (EQ  (TYPE  FN)  'LAMBDA) 

(TRIV-LAMBDACATE 
(LAMBDA WARS  FN) 

(AMAPCAR  (LAMBDA  (A)  (TRIVIALIZE  A RNL)) 

(CDR  ARGS)) 

(TRIVIALIZE  (LAMBDAXBODY  FN)  RNL)) 

(ERROR  ' fStrange  Trivial  Function  - TRIVIAL !ZE| 
NODE 

■FAIL-ACT)))))))))) 


(DEFINE  TRIV-LAMBOACATE 

(LAMBDA  (VARS  ARGS  BODY) 

(LABELS  ((LOOP 

(LAMBDA  (V  A REALVARS  REALARGS  EFFARGS) 

(IF  (NULL  A) 

(LET  ((RV  (NREVERSE  REALVARS))) 

(OR  (NULL  V) 

(ERROR  ' |Ve  blew  It  In  TRIV-LAMBDACATE  | V 'FAIL-ACT)) 
(LET  ((B  (IF  RV 

•((LAMBDA  .RV 

(COMMENT 

(VARS  « .(MAP-USER-NAMES  RV))) 
KDEPROGNIFY  BODY)) 
t( NREVERSE  REALARGS)) 

BOOY) ) ) 

(IF  EFFARGS 

•(PROGN  IEFFARGS  !( DEPROGNIFY  B)) 

B))) 

(IF  (OR  (GET  (CAR  V)  'READ-REFS) 

(GET  (CAR  V)  ’WRITE-REFS)) 

(LOOP  (COR  V) 

(COR  A) 

(CONS  (CAR  V)  REALVARS) 

(CONS  (CAR  A)  REALARGS) 

EFFARGS) 

(LOOP  (CDR  V) 

(COR  A) 

REALVARS 

REALARGS 

(CONS  (CAR  A)  EFFARGS))))))) 

(LOOP  VARS  ARGS  NIL  NIL  NIL)))) 
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We  have  examined  the  entire  code  generator,  and  now  turn  to  high-level 
control  routines.  COMPILATE-ONE-FUNCTION  is  the  highest-level  entry  to  the  code 
generator,  called  by  COMPILE.  It  takes  a code-tree  and  the  user-name  for  the 
function,  and  returns  a complete  piece  of  MacLISP  code  constituting  a module  for 
the  user  function.  It  generates  a global  name  for  use  as  the  module  name 
(PROGNAME),  and  invokes  COMPILATE-LOOP  (which  really  ought  to  have  been  a LABELS 
function,  but  was  too  big  to  fit  on  the  paper  that  way).  The  last  argument  is  a 
list  of  two  MacLISP  forms;  one  causes  a SCHEME  compiled  closure  form  (a  CBETA 
list)  to  be  put  in  the  value  cell  of  the  user-name,  so  that  it  will  be  a globally 
defined  SCHEME  function,  and  the  other  creates  a property  linking  the  PROGNAME 
with  the  USERNAME  for  debugging  purposes. 


COMPILATE-LOOP  repeatedly  calls  COMPILATE,  giving  it  the  next  function  on 
the  FNS  list.  Of  course,  the  invocation  of  COMPILATE  may  cause  new  entries  to 
appear  on  the  FNS  list.  COMPILATE-LOOP  iterates  until  the  FNS  list  converges  to 
emptiness.  As  it  does  so  it  takes  each  piece  of  generated  code  and  strings  it 
together  as  PROGBODY.  It  also  calculates  in  TMAX  the  maximum  over  all  MAXDEP 
slots;  this  is  the  only  place  where  the  MAXDEP  slot  is  ever  used. 

When  FNS  is  exhausted,  a module  is  constructed.  This  contains  a comment, 
a MacLISP  DEFUN  form  for  defining  a MacLISP  function,  a SETQ  form  to  put  the  SUBR 
pointer  in  the  value  cell  of  the  PROGNAME  (for  the  benefit  of  code  which  creates 
CBETA  forms),  and  extra  "stuff".  TMAX  is  used  to  generate  a list  of  all 
temporary  variables  used  in  the  module;  a MacLISP  SPECIAL  declaration  is  created 
to  advise  the  MacLISP  compiler. 


USED-TEMPLOCS  takes  a TMAX  value  and  generates  the  names  of  all  temporary 
registers  (whose  names  are  of  the  form  -nn-;  standard  argument  registers  are  not 
included)  up  to  that  number. 
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(DEFINE  COMP  I LATE  -ONE  -FUNCT  ION  ' ;COMPL  ICATE-ONE-FONCTION? 

(LAMBDA  (CNODE  USERNAME) 

(LET  ((PROGNAME  ( GEN -GLOBAL- NAME )) ) 

(COMP I LATE -LOOP  USERNAME 
PROGNAME 

(LIST  (LIST  USERNAME  CNOOE)) 

(LIST  (LIST  PROGNAME  CNODE  NIL)) 

NIL 

0 

(LIST  "(SETQ  .USERNAME 

(LIST  'CBETA 

.PROGNAME 

• .(CLAMBDA\NAME  ( CNODE \CFORM  CNODE)))) 

“( DEFPROP  .PROGNAME  .USERNAME  USER-FUNCTION)))))) 

(DEFINE  COMP  I LATE -LOOP 

(LAMBOA  (USERNAME  PROGNAME  BLOCKFNS  FNS  PROGBODY  TMAX  STUFF) 

(IF  (NULL  FNS) 

"( PROGN  'COMPILE 

(COMMENT  MODULE  FOR  FUNCTION  .USERNAME) 

(OEFUN  .PROGNAME  () 

(PROG  ( ) 

(DECLARE  (SPECIAL  .PROGNAME  KUSED-TEMPLOCS  TMAX))) 

(GO  (PROG?  NIL 

(CAR  «*ENV«») 

(SETQ  **ENV*»  (COR  *«ENV*«))>) 
f(NRE VERSE  PROGBODY))) 

(SETQ  .PROGNAME  (GET  '.PROGNAME  'SUBR)) 

ISTUFF ) 

(COMPILATE  (CAR  (CAR  FNS)) 

(CADR  (CAR  FNS)) 

(CADDR  (CAR  FNS)) 

BLOCKFNS 
(CDR  FNS) 

(LAMBDA  (CODE  NEWFNS) 

(LET  ( (CFM  ( CNODE \CFORM  (CADR  (CAR  FNS))))) 

(COMP  I LATE -LOOP 
USERNAME 
PROGNAME 
BLOCKFNS 
NEWFNS 

(NCONC  (REVERSE  (DEPR06NIFY1  CODE  T)) 

(CONS  (REMARK-ON  (CADR  (CAR  FNS))) 

(CONS  (EQCASE  (TYPE  CFM) 

(CLAMBDA 

(CLAMBDANNAME  CFM)) 
(CONTINUATION 
(CONTINUATION\NAME  CFM))) 
PROGBODY))) 

(MAX  TMAX 

(EOCASE  (TYPE  CFM) 

(CLAMBDA 

( CL AMBDA\MAXDEP  CFM)) 

(CONTINUATION 

( CONT INUAT IONNMAXDEP  CFM)))) 

STUFF))))))) 

(DEFINE  USED-TEMPLOCS 
(LAMBDA  (N) 

(DO  ((J  (♦  ••NUMBER-OF  -ARG-REGS**  1)  (+  j 1)) 

(X  NIL  (CONS  (TEMPLOC  J)  X))) 

<(>  J N)  (NRE VERSE  X))))) 
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RENARK-ON  takes  a cnode  for  a CLANBDA  or  CONTINUATION  and  generates  a 
comment  containing  pertinent  information  about  invoking  that  function.  This 
comment  will  presumably  be  inserted  in  the  output  code  to  guide  the  debugging 

process . 

MAP-USER-NANES  takes  a list  of  internal  variable  names  and  returns  a list 
of  the  corresponding  user  names  for  the  variables,  as  determined  by  the  USER-NAME 
property.  (If  a variable  is  an  internally  generated  one,  e.g.  for  a 
continuation,  then  it  will  have  no  USER-NAME  property,  and  the  internal  name 
itself  is  used. ) 
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(OEFINE  REMARK-ON 

(LAMBDA  ( CNOOE ) 

(LET  ( ( CFM  (CNOOEXCFORM  CNOOE))) 

(LABELS  ( (REMARK1 

(LAMBDA  (OEP  FNP  VARS  ENV) 

"(COMMENT  (DEPTH  « ,DEP) 

(FNP  - ,FNP ) 

P(IF  VARS  "((VARS 
l( IF  ENV  ’((ENV  • 

(EQCASE  (TYPE  CFM) 

(CLAMBDA 

( REMARK 1 (CLAMBDA\OEP  CFM) 


„ (MAP-USER-NAMES  VARS)))) 
(MAP-USER-NAMES  ENV)))))))) 


(CLAMBDA\FNP  CFM) 

(IF  (EQ  (CLAMBOA\FNP  CFM)  1 NOCLOSE) 

( CLAMBDAXTVARS  CFM) 

(CLAMBDAWARS  CFM)) 

(APPEND  (CLAMBDANCLOSEREFS  CFM) 
(CLAMBDAXCONSENV  CFM)))) 

(CONTINUATION 

(REMARK1  (CONTINUATIONNDEP  CFM) 

(CONTINUATIONXFNP  CFM) 

NIL  ; NEVER  INTERESTING  ANYWAY 

(APPEND  ( CONT INUAT IONXCLOSEREFS  CFM) 

(CONTI NUAT IONNCONSENV  CFM))))))))) 


( OEF 


INE  MAP-USER-NAMES 
(LAMBDA  (VARS) 

( AMAPCAR  (LAMBDA  (X)  (OR  (GET  X 'USER-NAME)  X))  VARS))) 
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The  next  few  pages  contain  routines  which  deal  with  files.  COMPILE  takes  , 

a file  name,  and  compiles  all  the  code  in  that  file,  producing  an  output  file  of 
MacLISP  code  suitable  for  giving  to  the  MacLlSP  compiler.  It  also  computes  the  I 

CPU  time  required  to  compile  the  file. 

FN  gets  the  given  file  name,  processed  and  defaulted  according  to 
ITS/NacLISP  standard  conventions.  RT  and  GCT  save  runtime  and  gc-runtime 
information. 

IFILE  and  OFILE  get  MacLISP  "file  objects"  created  by  the  OPEN  function, 
which  opens  the  file  specified  by  its  first  argument.  (The  output  file  names  are  J 

initially  " RABB  OUTPUT",  conforming  to  an  ITS  standard.  These  will  later  be 
renamed . ) 

*GLOBAL-GEN-PREFIX*  is  initialized  as  a function  of  the  file  name,  to 
"directory=f  irstname" . This  is  to  guarantee  that  the  global  symbols  generated  ( 

for  two  different  compiled  files  of  SCHEME  code  will  not  conflict  should  the  two  \ 

files  he  loaded  into  the  same  SCHEME  together.  (Notice  the  use  of  SYMEVAL.  This 
is  necessary  because  MacLISP  allows  names  to  be  used  in  two  different  kinds  of 
contexts,  one  meaning  the  "functional"  value,  and  the  other  meaning  the 
"variable"  value.  SCHEME  does  not  make  this  distinction,  and  tries  to  make  the 
functional  value  available,  but  does  not  do  this  consistently.  This  is  a problem 
which  results  from  a fundamental  difference  in  semantics  between  SCHEME  and 
MacLISP.  For  such  variables  as  DEFAULTF  and  TYO,  which  in  MacLISP  are  used  for 
both  purposes,  it  is  necessary  to  use  SYMEVAL  to  specify  that  the  variable, 
rather  than  the  function,  is  desired.) 

(SYMEVAL  'TYO)  refers  to  the  file  object  for  the  terminal;  this  is  used 
to  print  out  messages  to  the  user  while  the  file  is  being  compiled.  Various 
information  is  also  printed  to  the  file,  including  identification  and  a 
timestamp.  The  DECLARE  form  printed  to  the  output  file  contains  the  names  of  the 
standard  argument  registers,  and  also  **ENV**,  **FUN**,  and  **NARGS**.  (This  is 
why  USF.D-TF.MPLOCS  need  not  generate  names  of  standard  argument  registers  --  this 
single  global  declaration  covers  them.)  The  second  DECLARE  form  defines  to  the 
MacLISP  compiler  a function  called  DISPLACE  for  obscure  reasons  having  to  do  with 
the  implementation  of  SCHEME  macros. 

TRANSDUCE  does  the  primary  work  of  processing  the  input  file.  When  it  is 
done,  another  timestamp  is  printed  to  the  output  file,  so  that  the  real  time 
consumed  can  be  determined;  then  the  runtime  statistics  are  calculated  and 
printed,  along  with  the  number  of  errors  if  any  occurred.  The  output  file  is 
then  renamed  as  "f irstname  LISP"  and  closed.  The  statistics  message  is  returned 
so  that  it  will  be  printed  as  the  last  message  on  the  terminal. 
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(DEFINE  COMFILE 

(LAMBDA  (F NAME) 

(LET  ( ( FN  ( DEFAULTF  (MERGEF  FNAME  '(•  >)))) 

(RT  (RUNTIME)) 

(GCT  (STATUS  GCTIME ) ) ) 

(LET  ( ( IF  ILE  (OPEN  FN  ’IN)) 

(OFILE  (OPEN  (MERGEF  ■ ( _RABB_  OUTPUT ) FN)  ’OUT))) 

(SET'  ‘GLOBAL-GEN-PREF IX* 

(CATENATE  (CADAR  (SYMEVAL  'DEFAULTF)) 

'1*1 

(CADR  (SYMEVAL  ‘DEFAULTF)))) 

(LET  ( ( TN  (NAMESTRING  ( TRUENAME  IF ILE ) ) ) ) 

(PRINT  "(COMMENT  THIS  IS  THE  RABBIT  LISP  COOE  FOR  .TN)  OFILE) 
(TIMESTAMP  OFILE) 

(TERPRI  OFILE) 

(TERPRI  (SYMEVAL  ’TYO)) 

(PRINC  ' | ;Beg1nning  RABBIT  compilation  on  | (SYMEVAL  'TYO)) 

(PR INC  TN  (SYMEVAL  'TYO))) 

(PRINT  "(DECLARE  (SPECIAL  |**CONT+ARG-REGS**  **ENV**  **FUN**  **NARGS»* ) ) 
OFILE) 

(PRINT  '(DECLARE  (OEFUN  DISPLACE  (X  Y)  Y) ) OFILE) 

(ASET'  "TESTING*  NIL) 

(ASET'  ‘ERROR-COUNT*  0) 

(ASET'  *ERROR-L 1ST*  NIL) 

(TRANSOUCE  IF ILE 
OFILE 
(LIST  NIL) 

(CATENATE  ‘|INIT-|  (CADR  (TRUENAME  IF ILE ) ) ) ) 

(TIMESTAMP  OFILE) 

(LET  ((X  (*QUO  (-  (RUNTIME)  RT)  1.0E6)) 

(Y  (*QUO  (-  (STATUS  GCTIME)  GCT)  1.0E6))) 

(LET  ((MSG  "(COMPILE  TIME:  ,X  SECONDS 

(GC  TIME  ,Y  SECONDS) 

(NET  ,(-$  X Y)  SECONDS) 

•(IF  (NOT  (ZEROP  ‘ERROR-COUNT*  ) ) 

"((. ‘ERROR -COUNT*  ERRORS)))))) 

(PRINT  "(COMMENT  .MSG)  OFILE) 

( RENAMEF  OFILE 

(MER6EF  (LIST  (CADR  FN)  ’LISP) 

FN)) 

(CLOSE  OFILE) 

MSG)))))) 
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TRANSDUCE  processes  forms  from  IFILE,  one  by  one,  calling  PROCESS-FORM  to 
do  the  real  work  on  each  one.  PROCESS-FORM  may  print  results  on  OFILE,  and  may 
also  return  a list  of  "random  forms"  to  be  saved. 

The  business  of  "random  forms"  has  to  do  with  the  fact  that  the  file 
being  compiled  may  contains  forms  which  are  not  function  definitions.  The 
expectation  is  that  when  the  file  is  loaded  these  forms  will  be  evaluated  during 
the  loading  process,  and  this  is  indeed  true  if  the  interpreter  loads  the 
original  file  of  source  forms. 

Now  MacLISP  provides  a facility  for  evaluating  random  forms  within  a 
compiled  file,  but  they  are  evaluated  as  MacLISP  forms,  not  SCHEME  forms.  To  get 
around  this  whole  problem,  I chose  another  solution.  All  the  random  forms  in  the 
file  are  accumulated,  and  then  compiled  as  the  body  of  a function  named  "1NIT- 
firstname".  In  this  way,  once  the  compiled  code  is  loaded,  the  user  is  expected 
to  say  ( INIT-f irstname)  to  cause  the  random  forms  to  be  evaluated. 

This  idea  would  have  worked  pe. fectly  except  that  files  typically  have  a 
large  number  of  random  forms  in  them  (macro  definitions  create  one  or  two  random 
forms  as  well  as  the  definition  of  the  macro-function).  Putting  all  the  random 
forms  together  in  a single  function  often  creates  a function  too  big  for  RABBIT 
to  compile,  given  PDP-10  memory  limitiations . The  four  lines  of  code  in 
TRANSDUCE  for  this  have  therefore  been  commented  out  with  a at  the  beginning 
of  each  line. 

The  final  solution  was  to  compile  each  random  form  as  its  own  function, 
and  arrange  for  all  these  little  functions  to  be  chained;  each  one  executes  one 
random  form  and  then  calls  the  next.  A call  to  INIT-f  irstname  starts  the  chain 
going. 


This,  then,  is  the  purpose  of  the  big  DO  loop  in  TRANSDUCE:  to  construct 
all  the  little  functions  for  the  random  forms.  The  third  argument  to  PROCESS- 
FORM  may  be  NIL,  which  suppresses  the  printing  of  any  messages  on  the  terminal; 
this  spares  the  user  having  to  see  tens  or  hundreds  of  uninteresting  messages 
concerning  the  compilation  of  these  initialization  functions.  However,  so  that 
the  user  will  not  be  dismayed  at  the  long  pause,  a message  saying  how  many  random 
forms  there  were  is  printed  first. 


READIFY  implements  the  MacLISP  convention  that  if  the  value  of  the 
variable  READ  is  non-nil,  then  that  value  is  the  read-in  function  to  use;  while 
if  it  is  NIL,  then  the  function  READ  is  the  read-in  function.  (This  "hook"  is 
the  method  by  which  CGOL  works,  for  example.) 


001 

002 

003 

004 

005 

006 
00  2 
008 

009 

010 

011 

012 

013 

014 

015 

016 

017 

018 

019 

020 
021 
022 

023 

024 

025 

026 

027 

028 

029 

030 

031 

032 

033 

034 

035 

036 

037 

038 

039 

040 

041 

042 

043 

044 

045 

046 


RABBIT  568  05/15/76  P«ge  67 

(DEFINE  TRANSDUCE 

(LAMBDA  (IFILE  OF  HE  EOF  INITNAME) 

(LABELS  ((LOOP 

(LAMBDA  (FORM  RANDOM-FORMS) 

(IF  (EQ  FORM  EOF) 

(DO  ((X  (6ENTEMP  INITNAME)  (6ENTEMP  INITNAME)) 

(Y  NIL  X) 

(Z  RANDOM- FORMS  (CDR  Z))) 

((NULL  Z) 

(IF  RANDOM-FORMS 

(PRINT  *(, (LENGTH  RANDOM-FORMS) 

RANDOM  FORMS  IN  FILE  TO  COMPILE) 

(SYMEVAL  'TYO))) 

(IF  Y (PROCESS-FORM  '(DECLARE  (SPECIAL  ,Y)) 

OFILE 

T)) 

(PROCESS-FORM  '(DEFINE  , INITNAME 

(LAMBDA  ()  .(IF  Y (LIST  Y)  NIL))) 

OFILE 

T)) 

(IF  Y (PROCESS-FORM  "(DECLARE  (SPECIAL  ,Y)) 

OFILE 

NIL)) 

(PROCESS-FORM  '(DEFINE  ,X 

( LAMBOA  ( ) 

(BLOCK  .(CAR  Z) 

.(IF  Y 

(LIST  Y) 

mi)))) 

OFILE 

mi)) 

; (PROCESS-FORM 

I '(DEFINE  .INITNAME 

; (LAMBDA  ()  (BLOCK  *R ANDOH - F ORMS  NIL  NIL))) 

; OFILE) 

(LET  ((X  (PROCESS-FORM  FORM  OFILE  T) ) ) 

(LOOP  (READIFY  IFILE  EOF)  (NCONC  X RANDOM- FORMS )))))) ) 
(LOOP  (READIFY  IFILE  EOF)  NIL)))) 


(DEFINE  READIFY  ; FUNNY  MACLISP  CONVENTION  - READIFY' LL  DO  THE  JOB! 

(LAMBDA.  (IFILE  EOF) 

(IF  (SYMEVAL  'READ) 

(APPLY  (SYMEVAL  'READ)  IFILE  EOF) 

(READ  IFILE  EOF)))) 
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PROCESS-FORM  takes  a form,  an  output  file,  and  a switch  saying  whether  to 
be  "noisy".  The  form  is  broken  down  into  one  of  many  special  cases  and  processed 
accordingly.  The  returned  value  is  a list  of  "random  forms"  for  TRANSDUCE  to 
save  for  later  handling. 

An  atom,  while  pretty  useless,  is  transduced  directly  to  the  output  file. 

A DEFINE-form,  which  defined  a function  to  be  compiled,  is  given  to 
PROCESS-DEFINE-FORM.  This  is  the  interesting  case,  which  we  will  discuss  on  the 
next  page. 

A special  hack  handed  down  from  MacLISP  is  that  a form  (PROGN  'COMPILE 
...)  (and,  for  SCHEME,  the  analogous  (BLOCK  'COMPILE  ...))  should  be  treated  as 
if  all  the  subforms  of  the  PROGN  (or  BLOCK)  after  the  first  should  be  processed 
as  if  they  had  been  read  as  "top-level"  forms  from  the  file.  This  allows  a macro 
call  to  generate  more  than  one  form  to  be  compiled,  for  example.  It  is  necessary 
to  accumulate  all  the  results  of  the  calls  to  PROCESS-FORM  so  that  they  may  be 
collectively  returned. 

A PROCLAIM  form  contains  a set  of  forms  to  be  evaluated  by  RABBIT  at 
compile  time.  The  evaluation  is  accomplished  by  constructing  a LAMBDA  form  and 
using  the  SCHEME  primitive  ENCLOSE  to  create  a closure,  and  then  invoking  the 
closure.  As  a special  service,  the  variable  OFILE  is  made  apparent  to  the 
evaluated  form  so  that  it  can  print  information  to  the  output  file  if  desired. 

A DECLARE  form  is  meant  to  be  seen  by  the  MacLISP  compiler,  and  so  it  is 
passed  on  directly. 

A COMMENT  form  is  simply  eliminated.  (It  could  be  passed  through 
directly  with  no  harm.) 

A DF.FUN  form  is  passed  directly,  for  compilation  by  the  MacLISP  compiler. 

A form  which  is  a macro  call  is  expanded  and  re-processed.  As  a special 
hack,  those  which  are  calls  to  DEFMAC,  SCHMAC,  or  MACRO  are  also  evaluated 
(MacLISP  evaluation  serves),  so  that  the  defined  macro  will  be  available  for 
compiling  calls  to  it  later  in  the  file. 

Any  other  form  is  considered  "random",  and  is  returned  to  TRANSDUCE 
provided  *BUFFER-RANDOM-FORMS*  is  non-NIL.  This  switch  is  provided  in  case  it  is 
necessary  to  force  a random  form  (e.g.  an  ALLOC  form)  to  be  output  early  in  the 
file.  In  this  case  any  random  forms  must  be  MacLISP-evaluable  as  well  as  SCHENE- 
evaluahle.  (This  requirement  is  the  reason  RABBIT  has  random  forms  like  "(SET* 
FOO  ...)";  SETQ  is  unacceptable  to  SCHEME,  while  ASET'  is  unacceptable  to 
MacLISP,  but  SET'  happens  to  work  in  both  languages  for  setting  a global 
variable.)  RABBIT  itself  sets  *BUFFER-RANDOM-FORMS*  to  NIL  on  page  1 in  a 
PROCLAIM  form. 


(SET*  ‘OPTIMIZE*  T) 

(SET*  ‘BUFFER-RANDOM-FORMS*  T) 
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(OEFINE  PROCESS-FORM 

(LAMBDA  (FORM  OFILE  NOISYP) 

(COHO  ((ATOM  FORM) 

(PRINT  FORM  OFILE) 

NIL) 

((EQ  (CAR  FORM)  ’DEFINE) 

(PROCESS-OEFINE-FORM  FORM  OFILE  NOISYP) 

NIL) 

( ( ANO  (MEMO  (CAR  FORM)  '(BLOCK  PROGN)) 

(EQUAL  ( CADR  FORM)  ’’COMPILE)) 

(DO  ((F  (CDDR  FORM)  (CDR  F)) 

(Z  NIL  ( NCONC  Z (PROCESS-FORM  (CAR  F)  OFILE  NOISYP)))) 
((NULL  F)  Z))) 

((EQ  (CAR  FORM)  ’PROCLAIM) 

(AMAPC  (LAMBDA  (X)  ((ENCLOSE  '(LAMBDA  (OFILE)  .X))  OFILE)) 
(COR  FORM)) 

NIL) 

((EQ  (CAR  FORM)  ’DECLARE) 

(PRINT  FORM  OFILE) 

NIL) 

((EQ  (CAR  FORM)  ’COMMENT) 

NIL) 

((EQ  (CAR  FORM)  ’DEFUN) 

(PRINT  FORM  OFILE) 

NIL) 

((AND  (ATOM  (CAR  FORM)) 

(EO  (GET  (CAR  FORM)  ' AINT)  ’AMACRO) 

(NOT  (EQ  (GET  (CAR  FORM)  ’AMACRO)  ' AFSUBR) ) ) 

(IF  (MEMO  (CAR  FORM)  ' (DEFMAC  SCHMAC  MACRO)) 

(EVAL  FORM)) 

(PROCESS-FORM  (MACRO-EXPAND  FORM)  OFILE  NOISYP)) 

(T  (COND  (‘BUFFER-RANDOM-FORMS*  (LIST  FORM)) 

(T  (PRINT  FORM  OFILE)  NIL)))))) 
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PROCESS-DEFINE-FORM  disambiguates  the  three  DEFINE  formats  permitted  in 

SCHEME: 

(DEFINE  FOO  (LAMBDA  (X  Y)  ...)) 

(DEFINE  FOO  (X  Y)  ...) 

(DEFINE  (FOO  X Y)  . . . ) 

and  constructs  an  appropriate  LAMBDA-expression  in  standard  form. 

PROCESS-DEFINITION  takes  this  LAMBDA-expression  and  compiles  it,  after 
some  error  checks.  It  then  cleans  up,  and  if  desired  prints  a message  on  the 
console  to  the  effect  that  the  function  was  compiled  successfully. 


CLEANUP  is  used  to  clear  out  garbage  left  around  by  the  compilation 
process  which  is  no  longer  needed  (but  is  useful  during  the  compilation,  whether 
for  compilation  proper  or  only  for  debugging  should  the  compilation  process 
fail) . 


REPLACE  has  to  do  with  macros  which  displace  calls  to  them  with  the 
expanded  forms,  but  retain  enough  information  to  undo  this.  REPLACE  undoes  this 
and  throws  away  the  saved  information.  (The  DISPLACE  facility  is  normally  turned 
off  anyway,  and  is  used  only  to  make  the  compiler  run  faster  when  it  itself  is 
being  run  under  the  SCHEME  interpreter.  This  was  of  great  use  when  RABBIT  wasn't 
running  well  enough  to  compile  itself!) 

CiF.NFLUSH  removes  from  the  MacLISP  OBARRAY  all  the  temporary  generated 
symbols  created  by  GENTEMP. 

The  MAPATOMS  form  removes  from  every  atom  on  the  OBARRAY  the  properties 
shown.  This  takes  more  time  but  less  space  than  recording  exactly  which  atoms 
had  such  properties  created  for  them. 


001 

(DEFINE 
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002 

PROCESS-OEFINE-FORM 

003 

(LAMBDA  (FORM  OFILE  NOISYP) 

004 

(COND  ((ATOM  (CADR  FORM)) 

OOS 

(PROCESS-DEFINITION  FORM 

006 

OFILE 

007 

NOISYP 

008 

(CADR  FORM) 

009 

(IF  (NULL  (COODR  FORM)) 

010 

( CADDR  FORM) 

Oil 

‘(LAMBDA  .(CADDR  FORM) 

012 

(BLOCK  . , (CDDDR  FORM)))))) 

013 

(T  (PROCESS-DEFINITION  FORM 

014 

OFILE 

015 

NOISYP 

016 

(CAADR  FORM) 

017 

‘(LAMBDA  , (CDADR  FORM) 

018 

(BLOCK  . . (CDDR  FORM)))))))) 

019 

020 

(OEFINE 

PROCESS-DEFINITION 

021 

(LAMBDA  (FORM  OFILE  NOISYP  NAME  LAMBDA-EXP) 

022 

(COND  ((NOT  (EQ  (TYPEP  NAME)  ’SYMBOL)) 

023 

(WARN  (Function  Name  Not  SYMBOL | NAME  FORM)) 

024 

((OR  (NOT  (EQ  (CAR  LAMBOA-EXP)  ’LAMBDA)) 

025 

(AND  (ATOM  (CADR  LAMBDA-EXP)) 

026 

(NOT  (NULL  (CADR  LAMBDA-EXP))))) 

027 

(WARN  | Malformed  LAMBOA-expre*tion|  LAMBDA-EXP  FORM)) 

028 

(T  (PRINT  (COMPILE  NAME 

029 

LAMBDA -EXP 

030 

NIL 

031 

•OPTIMIZE*) 

032 

OFILE) 

033 

(CLEANUP) 

034 

(IF  NOISYP 

035 

(PRINT  (LIST  NAME  ’COMPILED) 

036 

(SYMEVAL  ’TYO))))))) 

037 

036 

(DEFINE 

CLEANUP 

039 

(LAMBDA  () 

040 

(BLOCK  (REPLACE) 

041 

(6ENFLUSH) 

042 

(MAPATOMS  ’(LAMBDA  (X) 

043 

(REMPROP  X ’READ-REFS) 

044 

(REMPROP  X 'WRITE -REFS) 

045 

(REMPROP  X ' NOOE ) 

046 

(REMPROP  X ’BINDING) 

04  7 

(REMPROP  X ’USER-NAME) 

048 

(REMPROP  X 'KNOWN -FUNCTION) 

049 

(REMPROP  X ' EASY- LABELS -FUNCTION) )) ) ) ) 
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SEXPRFY  and  CSEXPRFY  are  debugging  aids  which  take  a node-tree  or  cnode- 
tree  and  produce  a fairly  readable  S-expression  version  of  the  code  it 
represents.  They  are  used  by  the  SX  and  CSX  macros  defined  earlier.  The  USERP 
switch  for  SEXPRFY  specifies  whether  internal  variables  names  or  user  variable 
names  should  be  used  in  the  construction. 
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;;;  INVERSE  OF  ALPHATIZE.  USED  BY  SX.  E.6.,  FOR  DEBUGGING. 

(DEFINE  SEXPRFY 

(LAMBDA  (NODE  USERP) 

(LET  ((FM  (NODEXFORM  NODE))) 

(EQCASE  (TYPE  FM) 

(CONSTANT  "(QUOTE  ,(CONSTANT\VALUE  FM))) 

(VARIABLE  (IF  (ANO  USERP  (NOT  ( VAR  I ABLE  XGLOB ALP  FM))) 

(GET  (VARIABLEWAR  FM)  'USER-NAME) 

( VARIABLENVAR  FM))) 

(LAMBDA  "(LAMBDA  .(IF  USERP  ( LAMBOAXUVARS  FM)  (LAMBDAXVARS  FM)) 
.(SEXPRFY  ( LAMBDAXBOOY  FM)  USERP))) 

(IF  "(IF  .(SEXPRFY  (IFXPRED  FM)  USERP) 

.(SEXPRFY  (IFXCON  FM)  USERP) 

.(SEXPRFY  (IFXALT  FM)  USERP))) 

(ASET  "(ASET'  .(IF  (AND  USERP  (NOT  ( ASETXGLOBALP  FM))) 

(GET  (ASETXVAR  FM)  'USER-NAME) 

(ASETWAR  FM)) 

.(SEXPRFY  ( ASETXBODY  FM)  USERP))) 

(CATCH  "(CATCH  .(IF  USERP 

(SET  (CATCHXVAR  FM)  'USER-NAME) 

(CATCHXVAR  FM)) 

.(SEXPRFY  (CATCHXBODY  FM)  USERP))) 

(LABELS  "(LABELS  .( AMAPCAR  (LAMBDA  (V  D)  "(.(IF  USERP 

(GET  V ’USER-NAME) 
V) 

.(SEXPRFY  0 USERP))) 
(LABELSXF NVARS  FM) 

(LABELSXFNOEFS  FM)) 

.(SEXPRFY  (LABELSXBOOY  FM)  USERP))) 

(COMBINATION 

(AMAPCAR  (LAMBDA  (A)  (SEXPRFY  A USERP)) 

( COMB INAT ION \AR6S  FM))))))) 


(DEFINE  CSEXPRF Y 

(LAMBDA  (CNOOE ) 

(LET  ( (CFM  (CNODEXCFORH  CNOOE))) 

(EQCASE  (TYPE  CFM) 

(TRIVIAL  "(TRIVIAL  .(SEXPRFY  ( TRIVIAL \NODE  CFM)  NIL))) 
(CVARIABLE  (CVARIABLEXVAR  CFM)) 

(CLAMBDA  "(CLAMBOA  , ( CL AMBOAWARS  CFM) 

.( CSEXPRF Y (CLAMBDAXBODY  CFM)))) 

(CONTINUATION 

"(CONTINUATION  ( , (CONT INUAT IONWAR  CFM)) 

,( CSEXPRF Y (CONT INUAT lONXBODY  CFM)))) 

(CIF  "(C IF  ,( CSEXPRF Y (CIFXPREO  CFM)) 

.(CSEXPRFY  (C  IFXCON  CFM)) 

.(CSEXPSTY  (C IFNALT  CFM)))) 

(CASET  "(CASET'  .(CSEXPRFY  (CASETNCONT  CFM)) 

.(CASETNVAR  CFM) 

.(CSEXPRFY  (CASETNBOOY  CFM)))) 

(CLABELS  "(CLABELS  .(AMAPCAR  (LAMBDA  (V  D)  '(  ,V 

.(CSEXPRFY  D))) 
( CL ABELSNF NVARS  CFM) 
(CLABELSXFNDEFS  CFM)) 

.(CSEXPRFY  (CLA6ELSNBOOY  CFM)))) 

(CC0MB1NAT10N 

(AMAPCAR  CSEXPRFY  (CCOHBINATIONNARGS  CFM))) 

(RETURN 

"(RETURN  .(CSEXPRFY  (RETURNNCONT  CFM)) 

.(CSEXPRFY  (RETURNNVAL  CFM)))))))) 
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CHECK-NUNBER-OF-ARGS  is  used  by  COMPILE  and  ALPHA-COMBINATION  to  make 
sure  that  function  calls  and  definitions  agree  on  the  number  of  arguments  taken 
by  a function.  If  a mismatch  is  detacted,  a warning  is  issued.  This  check 
frequently  catches  various  typographical  errors.  The  argument  DEFP  is  NIL  unless 
this  call  is  on  behalf  of  a definition  rather  than  a call.  The  DEFINED  property 
is  used  only  so  that  a more  comprehensive  warning  may  be  given. 


*EXPR  and  *LEXPR  are  two  special  forms  suitable  for  use  in  a PROCLAIM 
form  for  declaring  that  certain  names  refer  to  MacLISP  functions  rather  than 
SCHEME  functions.  An  example,  for  PRINT-SHORT,  occurs  on  page  1 of  RABBIT. 


DUNPIT  establishes  a simple  user  interface  for  RABBIT.  After  loading  a 
compiled  RABBIT  into  a SCHEME  run-time  system,  the  call  (DUMPIT)  initializes  the 
RABBIT,  then  suspends  the  MacLISP  environment,  with  an  argument  which  is  an  ITS 
command  for  dumping  the  core  image.  When  this  core  image  is  later  loaded  and 
resumed,  DUMPIT  prints  "FILE  NAME:"  and  reads  a line.  All  the  user  need  do  is 
typoe  a file  name  and  a carriage  return  to  compile  the  file.  When  this  is  done, 
the  call  to  QUIT  kills  the  RABBIT  job. 


STATS  prints  out  statistics  accumulated  in  the  counters  listed  in  ♦STAT- 
IC ARS*.  RESET-STATS  resets  all  the  counters. 
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(DEFINE  CHECK-NUMBER-OF-ARGS 

(LAMBDA  (NAME  NARGS  DEFP) 

(OR  (GETL  NAME  '(*LEXPR  LSUBR ) ) 

(LET  ((N  (GET  NAME  ' NUMBER-OF -ARGS )) ) 

(IF  N 

(IF  (NOT  (•  N NARGS)) 

(IF  DEFP 

(WARN  (definition  disagrees  with  earlier  use  on  number  of  ergs  | 
NAME 
NARGS 
") 

(IF  (GET  NAME  ’DEFINED) 

(WARN  |use  disagrees  with  definition  on  number  of  args| 

NAME 

NARGS 

N) 

(WARN  |two  uses  disagree  before  definition  on  number  of  args  | 
NAME 
NARGS 
N)))) 

(PUTPROP  NAME  NARGS  ' NUMBER-OF -ARGS ) ) 

(IF  DEFP  (PUTPROP  NAME  ’T  ’ DEF INED) ) ) ) ) ) 

(OEFUN  *EXPR  FEXPR  (X) 

( MAPCAR  '(LAMBDA  (Y)  (PUTPROP  Y 'T  ’*EXPR))  X)) 

( DEFPROP  *EXPR  AFSUBR  AMACRO)  (DEF PROP  »EXPR  AMACRO  AINT) 

(OEFUN  *LEXPR  FEXPR  (X) 

(MAPCAR  ’(LAMBDA  (Y)  (PUTPROP  Y ’T  ’«LEXPR))  X)) 

(DEFPROP  *LEXPR  AFSUBR  AMACRO)  (DEFPROP  *LEXPR  AMACRO  AINT) 

(DEFINE  DUMPIT 

(LAMBDA  ( ) 

(BLOCK  ( I H I T - R ABB  IT) 

(SUSPEND  ' | :PDUMP  OSK : SCHEME ; TS  RABB I T | ) 

(TERPRI) 

(PRINC  ’ |F lie  name : |) 

(COMPILE  (HEADLINE)) 

(QUIT) ) )) 

(DEFINE  STATS 

(LAMBDA  () 

( AMAPC  (LAMBDA  (VAR) 

(BLOCK  (TERPRI) 

(PR INI  VAR) 

(PRINC  ’|  ■ |) 

(PR INI  (SYMEVAL  VAR)))) 

•STAT-VARS*))) 


(DEFINE  RESET-STATS 

(LAMBDA  ()  (AMAPC  (LAMBDA  (VAR)  (SET  VAR  0))  *STAT-VARS» ) ) ) 


Symbol  Table  for: 


QUUX; RABBIT  568 


05/15/76  Page  I 


• EXPR 

• EXPR 

• EXPR 
•EXPR 
•LEXPR 
•LEXPR 
•LEXPR 
•LEXPR 


/ 

< 


> 

ACCESSFN  

ACCESSFN  

AOOPROP  

ADJOIN  

AOJUST-KNOWNFN-CENV  

AINT  

AINT  

ALPHA-ASET  

ALPHA-ATOM  

ALPHA-BLOCK  

ALPHA-CATCH  

ALPHA-COMBINATION  

ALPHA- IF  

ALPHA-LABELS  

ALPHA-LABELS-DEFN  

ALPHA -LAMBDA  

ALPHATIZE  

AMACRO  

AMACRO  

ANALYZE  

ANALYZE -CCOMB I NAT  ION  

ANALYZE-CIF  

ANALYZE-CLABELS  

ANALYZE -CL AMBOA  

ANALYZE -CON T I NU AT  ION  

ANALYZE -RE  TURN  

APPEND  

ASET  

ASK  

ASSO  

ATOM  

BI6P  

B INO  - ANALYZE  

BIND-ANAL YZE-CASET  

B IND- ANALYZE -CCOMB INAT  ION 

BIND-ANAL YZE-CIF  

BIND-ANALYZE-CLABELS  

BIND-ANALYZE-CLAMBDA  

BIND -ANALYZE -CONTINUATION 

BINO-ANALYZE-RETURN  

BIND -CCOMB I NAT  ION -ANALYZE 

CAAAAR  

CAAADR  

CAAAR  

CAAOAR  

CAADDR  

CAADR  

CAAR  

CADAAR  

CADADR  

CAOAR  


SIDE  EFFECTS  017  005 

FEXPR  071  026 

PROPERTY  071  027 

AMACRO  071  029 

AINT  071  029 

FEXPR  071  031 

PROPERTY  071  032 

AMACRO  071  034 

AINT  071  034 

SIOE  EFFECTS  017  003 

SIDE  EFFECTS  017  004 

SIOE  EFFECTS  017  006 

SIDE  EFFECTS  017  006 

SIOE  EFFECTS  017  007 

SIDE  EFFECTS  017  009 

BDEFINE  004  002 

MACLISP  MACRO  004  004 

SCHEME  FUNCTION  006  004 

SCHEME  FUNCTION  006  029 

SCHEME  FUNCTION  052  002 

PROPERTY  071  029 

PROPERTY  071  034 

SCHEME  FUNCTION  010  010 

SCHEME  FUNCTION  009  032 

SCHEME  FUNCTION  ,..  011  011 

SCHEME  FUNCTION  010  029 

SCHEME  FUNCTION  011  037 

SCHEME  FUNCTION  010  002 

SCHEME  FUNCTION  010  040 

SCHEME  FUNCTION  011  002 

SCHEME  FUNCTION  009  042 

SCHEME  FUNCTION  009  005 

PROPERTY  071  029 

PROPERTY  071  034 

SCHEME  FUNCTION  056  002 

SCHEME  FUNCTION  060  003 

SCHEME  FUNCTION  059  002 

SCHEME  FUNCTION  059  025 

SCHEME.  FUNCTION  058  023 

SCHEME  FUNCTION  058  040 

SCHEME  FUNCTION  060  033 

SIDE  EFFECTS  017  057 

0ATA  TYPE  006  042 

PDP-10  SCHEME  MACRO  003  024 

SIDE  EFFECTS  017  059 

SIOE  EFFECTS  017  045 

SIDE  EFFECTS  017  052 

SCHEME  FUNCTION  034  030 

SCHEME  FUNCTION  035  040 

SCHEME  FUNCTION  .'  037  002 

SCHEME  FUNCTION  035  030 

SCHEME  FUNCTION  036  002 

SCHEME  FUNCTION  035  002 

SCHEME  FUNCTION  035  016 

SCHEME  FUNCTION  036  033 

SCHEME  FUNCTION  037  043 

SIDE  EFFECTS  017  024 

SIOE  EFFECTS  017  025 

SIDE  EFFECTS  017  016 

SIOE  EFFECTS  017  026 

SIDE  EFFECTS  017  027 

SIOE  EFFECTS  017  017 

SIDE  EFFECTS  017  012 

SIDE  EFFECTS  017  026 

SIDE  EFFECTS  017  029 

SIDE  EFFECTS  017  016 
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CAODAR  

CAOOOR  

CAOOR  

CAOR  

CAR  

CASE!  

CATCH  

CATENATE  

CCOM8 INAT ION  

CDAAAR  

CDAAOR  

COAAR  

CDADAR 

CDADDR  

COADR  

COAR  

COOAAR  

COOADR  

COOAR  

COOOAR  

CODOOR  

COOOR  

CODR  

COR  

CENV-ANALYZE  

CENV-CCOMB  INAT ION -ANALYZE 

CENV-TRIV-ANALYZE  

CHECK -COMB INAT ION -PE FFS  . 

CHECK-NUMBER  OF-ARGS  

C IF  

CL  

CLABELS  

CLAMBDA  

CLEANUP  

CLOBBER  

CLOSE -ANALYZE  

CNAME  

CNOOE  

CNOOIFY  

COMBINATION  

COMFILE  

COMP -BODY  

COMP  I LATE  

COMP ILATE -LOOP  

COMP  I LATE -ONE -FUNCTION  .. 

COMPILE  

COMPONENT-NAMES  

CONDICATE  

CONS  

CONS-CLOSEREFS  

CONSTANT  

CONTINUATION  

CONVERT  

CONVERT-ASET  

CONVERT-CATCH  

CONVERT-COMBINATION  

CONVERT- IF  

CONVERT-LABELS  

CONVERT-LAMBDA-FM  

COPY-COOE  

COPY-NOOES  

CSEXPRF Y 

CSX  

CVARIABLE  

CXR  

oecarcdrate  

OEFINE  


SIDE  EFFECTS  017  030 

SIDE  EFFECTS  017  031 

SIDE  EFFECTS  017  019 

SIDE  EFFECTS  017  013 

SIOE  EFFECTS  017  010 

DATA  TYPE  026  039 

DATA  TYPE  008  046 

MACLISP  MACRO  002  016 

DATA  TYPE  026  047 

SIDE  EFFECTS  017  032 

SIDE  EFFECTS  017  033 

SIDE  EFFECTS  017  020 

SIDE  EFFECTS  017  034 

SIDE  EFFECTS  017  035 

SIDE  EFFECTS  017  021 

SIDE  EFFECTS  017  014 

SIDE  EFFECTS  017  036 

SIDE  EFFECTS  017  037 

SIDE  EFFECTS  017  022 

SIOE  EFFECTS  017  038 

SIDE  EFFECTS  017  039 

SIOE  EFFECTS  017  023 

SIDE  EFFECTS  017  015 

SIDE  EFFECTS  017  011 

SCHEME  FUNCTION  032  016 

SCHEME  FUNCTION  033  026 

SCHEME  FUNCTION  033  004 

SCHEME  FUNCTION  016  002 

SCHEME  FUNCTION  071  002 

DATA  TYPE  026  038 

PDP-10  SCHEME  MACRO  007  049 

DATA  TYPE  026  040 

DATA  TYPE  026  017 

SCHEME  FUNCTION  069  038 

MACLISP  MACRO  004  025 

SCHEME  FUNCTION  040  002 

MACLISP  MACRO  004  015 

DATA  TYPE  026  007 

SCHEME  FUNCTION  027  002 

DATA  TYPE  006  055 

SCHEME  FUNCTION  066  002 

SCHEME  FUNCTION  045  006 

SCHEME  FUNCTION  041  016 

SCHEME  FUNCTION  064  017 

SCHEME  FUNCTION  064  002 

SCHEME  FUNCTION  007  010 

PROPERTY  005  060 

SCHEME  FUNCTION  062  004 

SIDE  EFFECTS  017  055 

SCHEME  FUNCTION  061  011 

DATA  TYPE  006  026 

OATA  TYPE  026  036 

SCHEME  FUNCTION  027  006 

SCHEME  FUNCTION  029  002 

SCHEME  FUNCTION  029  024 

SCHEME  FUNCTION  031  014 

SCHEME  FUNCTION  026  024 

SCHEME  FUNCTION  030  006 

SCHEME  FUNCTION  026  009 

SCHEME  FUNCTION  025  002 

SCHEME  FUNCTION  025  007 

SCHEME  FUNCTION  070  036 

MACLISP  MACRO  003  029 

DATA  TYPE  026  015 

SIOE  EFFECTS  017  040 

SCHEME  FUNCTION  062  015 

ODEFINE  001  062 
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OEFMAC  

DEFTYPE  

DEFTYPE  

OELPROP  

DEPROGNIFY  

DEPROGN IF  Y1  

DEPTH-ANAL YZE  

DISPLACE  

DUMP  IT  

EFFOEF  

EFFDEF  

EFFECTLESS  

EFFECTLESS-EXCEPT-CONS  . 

EFFS-ANALYZE  

EF  F S -ANAL YZE -COMB  I NAT  ION 

EFFS-ANALYZE - IF  

EFFS-INTERSECT  

EF  FS -UN  ION  

EMPTY  

ENV-ANALYZE  

ENVCARCDR  

EO  

EQCASE  

ERASE-ALL-NODES  

ERASE -NODE  

ERASE-NODES  

F ILTER-CLOSEREFS  

FIXP  

FLOATP  

FN  

FN  

FN  

FN-SIOE- AFFECTED  

FN-S I OE -EFFECTS  

GEN -GLOBAL -NAME  

GENFLUSH  

GENTEMP  

HUNKFN  

MUNKFN  

HUNKP  

IF  

INCREMENT  

INTERSECT  

LABELS  

LAMBDA  

LAMBOACATE  

LIST  

LOOKUP ICATE  

MACRO  

MACRO- EXPAND  

MAKE -RE TURN  

MAP-USER-NAMES  

MAPANALYZE  

MEMO  

ME  TA-COM8  INAT  ION-LAMBDA 
META -COMB  I NAT  ION- TRIVFN 

META-EVALUATE  

ME TA- IF -FUDGE  

META-SUBSTITUTE  

NAME  

NAME  

NAME  

NAME  

NODE  

NODIFY  

NOT  

NULL  


(DEFINE  001  063 

(DEFINE  005  002 

MACLISP  MACRO  005  008 

SCHEME  FUNCTION  006  012 

MACLISP  MACRO  042  005 

SCHEME  FUNCTION  042  009 

SCHEME  FUNCTION  038  010 

EXPR  001  006 

SCHEME  FUNCTION  071  037 

(DEFINE  016  039 

MACLISP  MACRO  016  034 

SCHEME  FUNCTION  023  036 

SCHEME  FUNCTION  023  039 

SCHEME  FUNCTION  014  006 

SCHEME  FUNCTION  015  031 

SCHEME  FUNCTION  015  010 

SCHEME  FUNCTION  023  028 

SCHEME  FUNCTION  015  002 

SCHEME  FUNCTION  002  005 

SCHEME  FUNCTION  012  018 

SCHEME  FUNCTION  043  002 

SIDE  EFFECTS  017  044 

MACLISP  MACRO  003  032 

MACLISP  MACRO  018  005 

MACLISP  MACRO  018  004 

SCHEME  FUNCTION  018  007 

SCHEME  FUNCTION  039  030 

SIDE  EFFECTS  017  050 

SIDE  EFFECTS  017  051 

FN-S1DE-EFFECTS  016  035 

FN-SIDE-AFFECTEO  016  036 

OKAY-TO-FOLD  016  037 

PROPERTY  016  036 

PROPERTY 016  035 

SCHEME  FUNCTION  002  041 

SCHEME  FUNCTION  002  036 

SCHEME  FUNCTION  002  030 

(DEFINE  004  028 

MACLISP  MACRO  004  030 

SIDE  EFFECTS  017  049 

DATA  TYPE  008  038 

MACLISP  MACRO  002  014 

SCHEME  FUNCTION  006  039 

DATA  TYPE  008  050 

DATA  TYPE  008  034 

SCHEME  FUNCTION  054  006 

•SIDE  EFFECT5  017  056 

SCHEME  FUNCTION  061  002 

(DEFINE  001  065 

SCHEME  FUNCTION  011  026 

SCHEME  FUNCTION  028  002 

SCHEME  FUNCTION  065  028 

SCHEME  FUNCTION  057  023 

SIDE  EFFECTS  017  058 

SCHEME  FUNCTION  021  007 

SCHEME  FUNCTION  020  0 3 7 

SCHEME  FUNCTION  019  007 

SCHEME  FUNCTION  020  010 

SCHEME  FUNCTION  024  009 

MACLISP  MACRO  004  011 

ACCESS  MACRO  004  031 

COMPONENT -NAMES  005  060 

SUPPRESSED-COMPONENT-NAMES  005  061 

DATA  TYPE  008  012 

SCHEME  FUNCTION  008  059 

SIDE  EFFECTS  017  053 

SIDE  EFFECTS  017  054 
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NUMBERP  SIDE  EFEECTS  . . 

OtCAY-TO-FOLD  PROPERTY  

OUTPUT-ASF  T SCHEME  FUNCTION 

PAIRL IS  SCHEME  FUNCTION 

PASS1  - ANA*  i/E  SCHEME  FUNCTION 

PASSABLE  SCHEME  FUNCTION 

PRIN1  SIDE  EFFECTS  .. 

PRINC  SIDE  EFFECTS  .. 

PRINT  SIDE  EFFECTS  .. 

PRINT-SHORT  EXPR  

PRINT-WARNING  SCHEME  FUNCTION 

PROCESS-OEF  INE -FORM  SCHEME  FUNCTION 

PROCESS-DEFINITION  SCHEME  FUNCTION 

PROCESS-FORM  SCHEME  FUNCTION 

PROOUCE-ASET  SCHEME  FUNCTION 

PRODUCE-COMBINATION  SCHEME  FUNCTION 

PRODUCE-COMBINATION-VARIABLE  SCHEME  FUNCTION 

PRODUCE-CONTINUATION-RETURN  SCHEME  FUNCTION 

PRODUCE  - IF  SCHEME  FUNCTION 

PROOUCt -LABELS  SCHEME  FUNCTION 

PRODUCE  - LAMBDA -COMB  I NAT  ION  SCHEME  FUNCTION 

PRODUCE-RETURN  SCHEME  FUNCTION 

PRODUCE-RETURN-1  SCHEME  FUNCTION 

PRODUCE  - TR I VF  N-COMB I NAT  ION  SCHEME  FUNCTION 

PROOUCE  - TR I VF  N - COMB  I NAT  ION-CONTINUAT  ION  SCHEME  FUNCTION 
PRODUCE  - TR  I VF  N-COMB  INAT ION-CVARI  ABLE  ..  SCHEME  FUNCTION 

PSETQ-ARGS  SCHEME  FUNCTION 

PSETQ-ARGS-ENV  SCHEME  FUNCTION 

PSE TO- TEMPS  SCHEME  FUNCTION 

PSETQIFY  SCHEME  FUNCTION 

PSE TO I FY- METHOD -2  SCHEME  FUNCTION 

PSETQIFY- ME THOO - 3 SCHEME  FUNCTION 

READ  SIDE  EFFECTS  .. 

READ  IF  Y SCHEME  FUNCTION 

REANALYZE  1 SCHEME  FUNCTION 

REFD-VARS  SCHEME  FUNCTION 

REGSLIST  SCHEME  FUNCTION 

REMARK-ON  SCHEME  FUNCTION 

REMOVE  SCHEME  FUNCTION 

RESET-STATS  SCHEME  FUNCTION 

RETURN  DATA  TYPE  

RPLACA  SIDE  EFFECTS  .. 

RPLACO  SIDE  EFFECTS  .. 

RPLACX  SIDE  EFFECTS  .. 

SCHMAC  'DEFINE  

SE T-UP - ASE TVARS  SCHEME  FUNCTION 

SETDIFF  SCHEME  FUNCTION 

SETPROP  SCHEME  FUNCTION 

SEXPRFY  SCHEME  FUNCTION 

STATS  SCHEME  FUNCTION 

SUBST-CANDIDATE  SCHEME  FUNCTION 

SUPPRESSEP-COMPONENT-NAMES  PROPERTY  

SX  MACLISP  MACRO  . 

SYWBOLP  SIDE  EFFECTS  .. 

TEMPLOC  SCHEME  FUNCTION 

TERPRI  SIDE  EFFECTS  .. 

TEST-COMPILE  SCHEME  FUNCTION 

TRANSDUCE  SCHEME  FUNCTION 

TRIV-ANALYZE  SCHEME  FUNCTION 

TRIV-ANALYZE-FN-P  SCHEME  FUNCTION 

TRIV-LAMBDACATE  SCHEME  FUNCTION 

TRIVFN  SCHEME  FUNCTION 

trivial  DATA  type  

TRIVIALIZE  SCHEME  FUNCTION 

TYI  SIDE  EFFECTS  .. 

TYO  SIDE  EFFECTS  .. 


TYPE 


HUNK  ACCESS  MACRO 


017  046 

016  037 
061  017 
007  002 
007  041 
023  042 

017  061 
017  062 
017  060 
003  016 
003  005 
069  002 

069  020 
068  006 

046  003 
051  002 

051  026 

052  023 
045  042 

047  003 

048  003 

053  002 
053  021 

049  002 

049  023 

050  002 
057  002 
057  006 
057  013 
055  005 

055  012 

056  002 
017  065 
067  042 
023  002 
034  053 
044  004 
065  002 
006  047 
071  055 
026  049 
017  041 
017  042 
017  043 

001  064 
044  032 

006  058 
006  018 

070  004 

071  046 
022  006 
005  061 
003  028 
017  048 
042  029 
017  063 

007  051 
067  002 
013  012 
013  056 
063  033 

002  009 
026  013 
063  002 
017  066 
017  064 
005  006 
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TYPEP  SIDE  EFFECTS  017  047 

UNION  SCHEME  FUNCTION  006  033 

USED-TEMPLOCS  SCHEME  FUNCTION  064  059 

VARIABLE  DATA  TYPE  008  030 

WARN  MACLISP  MACRO  003  002 


i 


